Review Category

[Paper Review] VLA-4D: Embedding 4D Awareness into Vision-Language-Action Models for SpatioTemporally Coherent Robotic Manipulation

A detailed review of the paper 'VLA-4D: Embedding 4D Awareness into Vision-Language-Action Models for SpatioTemporally Coherent Robotic Manipulation', posted on [arXiv] by Gim Hee Lee.
#Review#Vision-Language-Action Models#Robotic Manipulation#SpatioTemporal Coherence#4D Awareness#Visual Representation#Action Representation#Cross-Attention

[Paper Review] SAM 3: Segment Anything with Concepts

A detailed review of the paper 'SAM 3: Segment Anything with Concepts', posted on [arXiv].
#Review#Segment Anything Model#Open-Vocabulary Segmentation#Multimodal Foundation Model#Instance Segmentation#Video Object Tracking#Prompt Engineering#Data Engine#Human-in-the-loop

[Paper Review] RynnVLA-002: A Unified Vision-Language-Action and World Model

A detailed review of the paper 'RynnVLA-002: A Unified Vision-Language-Action and World Model', posted on [arXiv].
#Review#Vision-Language-Action (VLA) Model#World Model#Robotics#Unified Framework#Multi-modal Learning#Action Generation#Attention Mask#Continuous Control

[Paper Review] Loomis Painter: Reconstructing the Painting Process

A detailed review of the paper 'Loomis Painter: Reconstructing the Painting Process', posted on [arXiv].
#Review#Painting Process Generation#Video Diffusion Models#Media Transfer#Reverse Painting#Dataset Curation#Perceptual Distance Profile#Artistic Workflow#Generative AI

[Paper Review] Insights from the ICLR Peer Review and Rebuttal Process

A detailed review of the paper 'Insights from the ICLR Peer Review and Rebuttal Process', posted on [arXiv] by Nedjma Ousidhoum.
#Review#Peer Review#Rebuttal Process#ICLR#Score Dynamics#LLM Analysis#Reviewer Engagement#Academic Publishing#OpenReview

[Paper Review] Diversity Has Always Been There in Your Visual Autoregressive Models

A detailed review of the paper 'Diversity Has Always Been There in Your Visual Autoregressive Models', posted on [arXiv] by Yaxing Wang.
#Review#Visual Autoregressive Models#Diversity Collapse#Generative Diversity#Soft-Suppression Regularization#Soft-Amplification Regularization#Training-Free#Image Generation#Singular Value Decomposition

[Paper Review] Step-Audio-R1 Technical Report

A detailed review of the paper 'Step-Audio-R1 Technical Report', posted on [arXiv].
#Review#Audio Reasoning#Multimodal LLMs#Modality-Grounded Reasoning Distillation (MGRD)#Chain-of-Thought#Reinforcement Learning#Audio Understanding#Self-Distillation

[Paper Review] Scaling Spatial Intelligence with Multimodal Foundation Models

A detailed review of the paper 'Scaling Spatial Intelligence with Multimodal Foundation Models', posted on [arXiv].
#Review#Spatial Intelligence#Multimodal Foundation Models#Data Scaling#Perspective-taking#Visual Question Answering#Emergent Capabilities#Embodied AI#Benchmark Evaluation

[Paper Review] SAM 3D: 3Dfy Anything in Images

A detailed review of the paper 'SAM 3D: 3Dfy Anything in Images', posted on [arXiv].
#Review#3D Reconstruction#Generative Models#Single Image 3D#Object Reconstruction#Scene Understanding#Data Engine#Model-in-the-Loop#Human Preference

[Paper Review] PartUV: Part-Based UV Unwrapping of 3D Meshes

A detailed review of the paper 'PartUV: Part-Based UV Unwrapping of 3D Meshes', posted on [arXiv] by Hao Su.
#Review#UV Unwrapping#3D Meshes#Part-Based Decomposition#Neural Fields#Geometric Heuristics#Parameterization#Texture Mapping

[Paper Review] MiMo-Embodied: X-Embodied Foundation Model Technical Report

A detailed review of the paper 'MiMo-Embodied: X-Embodied Foundation Model Technical Report', posted on [arXiv].
#Review#Vision-Language Model (VLM)#Embodied AI#Autonomous Driving#Foundation Model#Multimodal Learning#Task Planning#Affordance Prediction#Spatial Understanding#Reinforcement Learning

[Paper Review] Draft and Refine with Visual Experts

A detailed review of the paper 'Draft and Refine with Visual Experts', posted on [arXiv].
#Review#Large Vision-Language Models (LVLMs)#Visual Grounding#Hallucination Mitigation#Agent Framework#Visual Question Answering (VQA)#Expert Coordination#Relevance Map#Multi-modal Reasoning

[Paper Review] VisPlay: Self-Evolving Vision-Language Models from Images

A detailed review of the paper 'VisPlay: Self-Evolving Vision-Language Models from Images', posted on [arXiv].
#Review#Self-Evolving#Vision-Language Models#Reinforcement Learning#Self-Play#Unlabeled Data#Multimodal Reasoning#Group Relative Policy Optimization#Hallucination Mitigation

[Paper Review] Medal S: Spatio-Textual Prompt Model for Medical Segmentation

A detailed review of the paper 'Medal S: Spatio-Textual Prompt Model for Medical Segmentation', posted on [arXiv] by Tao Chen.
#Review#Medical Segmentation#Foundation Model#Spatio-Textual Prompts#3D Convolution#Multi-modal Imaging#Dynamic Resampling#Parallel Inference#Iterative Refinement

[Paper Review] MHR: Momentum Human Rig

A detailed review of the paper 'MHR: Momentum Human Rig', posted on [arXiv] by Chris Twigg.
#Review#Parametric Body Model#Human Animation#Character Rigging#Pose Correctives#Skeletal Decoupling#Computer Graphics#AR/VR

[Paper Review] Aligning Generative Music AI with Human Preferences: Methods and Challenges

A detailed review of the paper 'Aligning Generative Music AI with Human Preferences: Methods and Challenges', posted on [arXiv] by Abhinaba Roy.
#Review#Generative Music AI#Preference Alignment#Reinforcement Learning from Human Feedback (RLHF)#Direct Preference Optimization (DPO)#Inference-Time Optimization#Music Generation#Human-Computer Interaction

[Paper Review] Φeat: Physically-Grounded Feature Representation

A detailed review of the paper 'Φeat: Physically-Grounded Feature Representation', posted on [arXiv].
#Review#Self-supervised Learning#Physically-Grounded Features#Material Representation#Intrinsic Scene Understanding#Vision Transformer#Synthetic Data#Contrastive Learning

[Paper Review] VIDEOP2R: Video Understanding from Perception to Reasoning

A detailed review of the paper 'VIDEOP2R: Video Understanding from Perception to Reasoning', posted on [arXiv].
#Review#Video Understanding#Reinforcement Fine-Tuning (RFT)#Large Video Language Models (LVLMs)#Perception and Reasoning#Chain-of-Thought (CoT)#Process-Aware Learning#Policy Optimization#Credit Assignment

[Paper Review] REVISOR: Beyond Textual Reflection, Towards Multimodal Introspective Reasoning in Long-Form Video Understanding

A detailed review of the paper 'REVISOR: Beyond Textual Reflection, Towards Multimodal Introspective Reasoning in Long-Form Video Understanding', posted on [arXiv] by Jingyang Chen.
#Review#Multimodal Reasoning#Long-Form Video Understanding#Self-Reflection#Reinforcement Learning#Tool-Augmented MLLMs#Visual Rethinking#Video Question Answering#Causal Attribution

[Paper Review] Proactive Hearing Assistants that Isolate Egocentric Conversations

A detailed review of the paper 'Proactive Hearing Assistants that Isolate Egocentric Conversations', posted on [arXiv].
#Review#Proactive Hearing Assistant#Egocentric Audio Processing#Speech Separation#Turn-taking Dynamics#Dual-Model Architecture#Real-time Inference#Wearable Devices#Dialogue Modeling

[Paper Review] Mitigating Label Length Bias in Large Language Models

A detailed review of the paper 'Mitigating Label Length Bias in Large Language Models', posted on [arXiv] by Katharina von der Wense.
#Review#Large Language Models#Label Bias#Calibration#In-Context Learning#Text Classification#Multi-token Labels#Label Length Bias#Multiple Choice QA

[Paper Review] A Brain Wave Encodes a Thousand Tokens: Modeling Inter-Cortical Neural Interactions for Effective EEG-based Emotion Recognition

A detailed review of the paper 'A Brain Wave Encodes a Thousand Tokens: Modeling Inter-Cortical Neural Interactions for Effective EEG-based Emotion Recognition', posted on [arXiv] by G. Maragatham.
#Review#EEG#Emotion Recognition#Transformer Architecture#Inter-Cortical Neural Interactions#Multi-Head Attention#Brain-Computer Interface#Affective Computing

[Paper Review] UFO^3: Weaving the Digital Agent Galaxy

A detailed review of the paper 'UFO^3: Weaving the Digital Agent Galaxy', posted on [arXiv].
#Review#Multi-Agent Systems#Cross-Device Orchestration#LLM-Powered Agents#Task Constellation#Directed Acyclic Graph (DAG)#Agent Interaction Protocol (AIP)#Fault Tolerance#Asynchronous Execution

[Paper Review] MiroThinker: Pushing the Performance Boundaries of Open-Source Research Agents via Model, Context, and Interactive Scaling

A detailed review of the paper 'MiroThinker: Pushing the Performance Boundaries of Open-Source Research Agents via Model, Context, and Interactive Scaling', posted on [arXiv] by cyyang822.
#Review#Research Agent#Tool-Augmented Reasoning#Interaction Scaling#Large Language Models#Reinforcement Learning#Context Management#Open-Source AI

[Paper Review] MicroVQA++: High-Quality Microscopy Reasoning Dataset with Weakly Supervised Graphs for Multimodal Large Language Model

A detailed review of the paper 'MicroVQA++: High-Quality Microscopy Reasoning Dataset with Weakly Supervised Graphs for Multimodal Large Language Model', posted on [arXiv] by Bo Yan.
#Review#Microscopy VQA#Multimodal LLM#Weak Supervision#Graph Neural Networks#Dataset Generation#Biomedical Imaging#Scientific Reasoning#Cross-Modal Consistency

[Paper Review] Genomic Next-Token Predictors are In-Context Learners

A detailed review of the paper 'Genomic Next-Token Predictors are In-Context Learners', posted on [arXiv].
#Review#In-Context Learning (ICL)#Genomic Sequences#Next-Token Prediction#Large Language Models (LLMs)#Modality-Agnostic AI#Meta-Learning#Bitstring Program Synthesis#Evo2

[Paper Review] Workload Schedulers -- Genesis, Algorithms and Differences

A detailed review of the paper 'Workload Schedulers -- Genesis, Algorithms and Differences', posted on [arXiv] by Vladimir Getov.
#Review#Workload Scheduling#Process Scheduling#Job Scheduling#Big Data Processing#Resource Management#Distributed Systems#Scheduling Algorithms#Performance Optimization

[Paper Review] Virtual Width Networks

A detailed review of the paper 'Virtual Width Networks', posted on [arXiv].
#Review#Virtual Width Networks#Transformer#Mixture-of-Experts (MoE)#Scaling Laws#Representation Learning#Model Efficiency#Multi-Token Prediction#Hyper-Connections

[Paper Review] HI-TransPA: Hearing Impairments Translation Personal Assistant

A detailed review of the paper 'HI-TransPA: Hearing Impairments Translation Personal Assistant', posted on [arXiv].
#Review#Multimodal AI#Hearing Impairment#Audio-Visual Speech Recognition#Curriculum Learning#Omni-Models#Assistive Technology#Lip Reading#Speech Translation

[Paper Review] DoPE: Denoising Rotary Position Embedding

A detailed review of the paper 'DoPE: Denoising Rotary Position Embedding', posted on [arXiv] by Min Yang.
#Review#Rotary Position Embedding#Transformer#Length Extrapolation#Attention Sink#Matrix Entropy#Denoising#Large Language Models

[Paper Review] CATS-V2V: A Real-World Vehicle-to-Vehicle Cooperative Perception Dataset with Complex Adverse Traffic Scenarios

A detailed review of the paper 'CATS-V2V: A Real-World Vehicle-to-Vehicle Cooperative Perception Dataset with Complex Adverse Traffic Scenarios', posted on [arXiv] by Juyoung Oh.
#Review#Cooperative Perception#Vehicle-to-Vehicle (V2V)#Autonomous Driving#Dataset#Adverse Traffic Scenarios#Sensor Fusion#Temporal Alignment#3D Bounding Box Annotation

[Paper Review] A Meta-Heuristic Load Balancer for Cloud Computing Systems

A detailed review of the paper 'A Meta-Heuristic Load Balancer for Cloud Computing Systems', posted on [arXiv] by Vladimir Getov.
#Review#Cloud Computing#Load Balancing#Meta-Heuristic#Genetic Algorithm#Simulated Annealing#Tabu Search#Resource Management#Service Migration

[Paper Review] Depth Anything 3: Recovering the Visual Space from Any Views

A detailed review of the paper 'Depth Anything 3: Recovering the Visual Space from Any Views', posted on [arXiv].
#Review#Depth Estimation#Multi-view Geometry#Transformer Architecture#Teacher-Student Learning#Pose Estimation#3D Reconstruction#Novel View Synthesis#Visual Space Recovery

[Paper Review] Black-Box On-Policy Distillation of Large Language Models

A detailed review of the paper 'Black-Box On-Policy Distillation of Large Language Models', posted on [arXiv].
#Review#Large Language Models (LLMs)#Knowledge Distillation (KD)#Black-box Distillation#Generative Adversarial Networks (GANs)#On-policy Learning#Reinforcement Learning#Minimax Game#Model Compression

[Paper Review] TiDAR: Think in Diffusion, Talk in Autoregression

A detailed review of the paper 'TiDAR: Think in Diffusion, Talk in Autoregression', posted on [arXiv].
#Review#Hybrid LLM Architecture#Diffusion-Autoregressive#Parallel Token Generation#Speculative Decoding#Structured Attention Masks#LLM Inference Acceleration#KV Cache

[Paper Review] Stemming Hallucination in Language Models Using a Licensing Oracle

A detailed review of the paper 'Stemming Hallucination in Language Models Using a Licensing Oracle', posted on [arXiv] by Richard Ackermann.
#Review#Hallucination Mitigation#Language Models#Knowledge Graphs#SHACL Validation#Epistemic Grounding#Retrieval-Augmented Generation#Neuro-symbolic AI

[Paper Review] Motif 2 12.7B technical report

A detailed review of the paper 'Motif 2 12.7B technical report', posted on [arXiv].
#Review#Large Language Model#LLM Efficiency#Grouped Differential Attention#Kernel Fusion#Parallel Muon#Supervised Fine-tuning#Architectural Scaling#Instruction Following

[Paper Review] MADD: Multi-Agent Drug Discovery Orchestra

A detailed review of the paper 'MADD: Multi-Agent Drug Discovery Orchestra', posted on [arXiv].
#Review#Multi-Agent System#Drug Discovery#LLM#Hit Identification#Virtual Screening#Generative AI#Property Prediction#Automated Machine Learning

[Paper Review] Agentic Refactoring: An Empirical Study of AI Coding Agents

A detailed review of the paper 'Agentic Refactoring: An Empirical Study of AI Coding Agents', posted on [arXiv] by Hajimu Iida.
#Review#AI Agents#Code Refactoring#Software Engineering#Empirical Study#Large Language Models#Code Quality#Agentic Software Development#Maintainability

[Paper Review] Adapting Web Agents with Synthetic Supervision

A detailed review of the paper 'Adapting Web Agents with Synthetic Supervision', posted on [arXiv] by Siwei Han.
#Review#Web Agents#Synthetic Data Generation#LLM#Task Refinement#Trajectory Refinement#Supervised Fine-tuning#Web Automation#Environment Adaptation

[Paper Review] VideoSSR: Video Self-Supervised Reinforcement Learning

A detailed review of the paper 'VideoSSR: Video Self-Supervised Reinforcement Learning', posted on [arXiv].
#Review#Video Understanding#Self-Supervised Learning#Reinforcement Learning#MLLMs#Pretext Tasks#Verifiable Rewards#Data Generation#Temporal Grounding

[Paper Review] Tiny Model, Big Logic: Diversity-Driven Optimization Elicits Large-Model Reasoning Ability in VibeThinker-1.5B

A detailed review of the paper 'Tiny Model, Big Logic: Diversity-Driven Optimization Elicits Large-Model Reasoning Ability in VibeThinker-1.5B', posted on [arXiv].
#Review#Small Language Models#Reasoning#Diversity Optimization#Supervised Fine-Tuning (SFT)#Reinforcement Learning (RL)#Spectrum-to-Signal Principle (SSP)#Mathematical Reasoning#Code Generation

[Paper Review] Grounding Computer Use Agents on Human Demonstrations

A detailed review of the paper 'Grounding Computer Use Agents on Human Demonstrations', posted on [arXiv].
#Review#Computer Use Agents#UI Grounding#Desktop Applications#Human Demonstrations#Large-Scale Dataset#Vision-Language Models#Supervised Fine-tuning#Reinforcement Learning

[Paper Review] DynaAct: Large Language Model Reasoning with Dynamic Action Spaces

A detailed review of the paper 'DynaAct: Large Language Model Reasoning with Dynamic Action Spaces', posted on [arXiv] by Lingpeng Kong.
#Review#Large Language Models#Sequential Reasoning#Action Space Construction#Submodular Optimization#Markov Decision Process#Monte Carlo Tree Search#Utility-Diversity Trade-off

[Paper Review] The Station: An Open-World Environment for AI-Driven Discovery

A detailed review of the paper 'The Station: An Open-World Environment for AI-Driven Discovery', posted on [arXiv] by wydu.
#Review#Multi-Agent System#Open-World Environment#Scientific Discovery#AI-Driven Research#Large Language Models#Emergent Behavior#State-of-the-Art (SOTA)

[Paper Review] Robot Learning from a Physical World Model

A detailed review of the paper 'Robot Learning from a Physical World Model', posted on [arXiv].
#Review#Robot Learning#Video Generation#Physical World Model#Reinforcement Learning#Zero-shot Manipulation#Object-Centric Learning#Sim-to-Real

[Paper Review] Reasoning with Confidence: Efficient Verification of LLM Reasoning Steps via Uncertainty Heads

A detailed review of the paper 'Reasoning with Confidence: Efficient Verification of LLM Reasoning Steps via Uncertainty Heads', posted on [arXiv] by Jiaheng Zhang.
#Review#LLM Reasoning Verification#Uncertainty Quantification (UQ)#UHeads#Process Reward Models (PRMs)#Chain-of-Thought (CoT)#Self-Supervised Learning#Computational Efficiency#Domain Generalization

[Paper Review] MPJudge: Towards Perceptual Assessment of Music-Induced Paintings

A detailed review of the paper 'MPJudge: Towards Perceptual Assessment of Music-Induced Paintings', posted on [arXiv].
#Review#Music-Painting Cross-Modal#Perceptual Assessment#Modality-Adaptive Normalization#Direct Preference Optimization#Cross-Modal Fusion#Dataset Annotation#Affective Computing

[Paper Review] DRIVE: Data Curation Best Practices for Reinforcement Learning with Verifiable Reward in Competitive Code Generation

A detailed review of the paper 'DRIVE: Data Curation Best Practices for Reinforcement Learning with Verifiable Reward in Competitive Code Generation', posted on [arXiv].
#Review#Reinforcement Learning with Verifiable Reward#Competitive Programming#Code Generation#Data Curation#Curriculum Learning#Supervised Fine-tuning#Entropy Expansion

[Paper Review] DIMO: Diverse 3D Motion Generation for Arbitrary Objects

A detailed review of the paper 'DIMO: Diverse 3D Motion Generation for Arbitrary Objects', posted on [arXiv] by Kostas Daniilidis.
#Review#3D Motion Generation#Generative Models#Arbitrary Objects#Neural Key Points#Latent Space#4D Content Generation#Diffusion Models#3D Gaussian Splatting

[Paper Review] Visual Spatial Tuning

A detailed review of the paper 'Visual Spatial Tuning', posted on [arXiv].
#Review#Vision-Language Models#Spatial Reasoning#Spatial Perception#Dataset Creation#Reinforcement Learning#Visuospatial AI#Robotics

[Paper Review] Real-Time Reasoning Agents in Evolving Environments

A detailed review of the paper 'Real-Time Reasoning Agents in Evolving Environments', posted on [arXiv].
#Review#Real-time Reasoning#LLM Agents#Dynamic Environments#Dual-System AI#AgileThinker#Reactive Planning#Cognitive Load#Time Pressure

[Paper Review] Jailbreaking in the Haystack

A detailed review of the paper 'Jailbreaking in the Haystack', posted on [arXiv] by Alexander Robey.
#Review#Jailbreaking#LLM Safety#Long-Context Models#Positional Bias#Attack Success Rate (ASR)#Prompt Engineering#Compute Efficiency#AI Agents

[Paper Review] Dense Motion Captioning

A detailed review of the paper 'Dense Motion Captioning', posted on [arXiv] by Paolo Rota.
#Review#3D Human Motion#Dense Captioning#Large Language Models#Motion Understanding#Temporal Localization#Human-Language Datasets#Motion Generation

[Paper Review] DeepEyesV2: Toward Agentic Multimodal Model

A detailed review of the paper 'DeepEyesV2: Toward Agentic Multimodal Model', posted on [arXiv] by Guohai Xu.
#Review#Agentic AI#Multimodal Models#Tool Use#Reinforcement Learning#Supervised Fine-tuning#Multimodal Reasoning#Web Search#Code Execution

[Paper Review] V-Thinker: Interactive Thinking with Images

A detailed review of the paper 'V-Thinker: Interactive Thinking with Images', posted on [arXiv] by Peiqing Yang.
#Review#Large Multimodal Models#Interactive Reasoning#Vision-Centric Thinking#Reinforcement Learning#Data Synthesis#Visual Tools#Curriculum Learning#Multimodal AI

[Paper Review] Scaling Agent Learning via Experience Synthesis

A detailed review of the paper 'Scaling Agent Learning via Experience Synthesis', posted on [arXiv].
#Review#Reinforcement Learning#LLM Agents#Experience Synthesis#World Models#Curriculum Learning#Sim-to-Real Transfer#Web Agents

[Paper Review] NVIDIA Nemotron Nano V2 VL

A detailed review of the paper 'NVIDIA Nemotron Nano V2 VL', posted on [arXiv].
#Review#Vision-Language Model#Hybrid Architecture#Mamba-Transformer#Long-Context Understanding#Quantization#Efficient Inference#Document AI#Video AI

[Paper Review] How to Evaluate Speech Translation with Source-Aware Neural MT Metrics

A detailed review of the paper 'How to Evaluate Speech Translation with Source-Aware Neural MT Metrics', posted on [arXiv] by Luisa Bentivogli.
#Review#Speech Translation#Neural MT Metrics#Source-Aware Evaluation#Automatic Speech Recognition (ASR)#Back-Translation (BT)#Cross-lingual Re-segmentation#COMET#MetricX

[Paper Review] Cambrian-S: Towards Spatial Supersensing in Video

A detailed review of the paper 'Cambrian-S: Towards Spatial Supersensing in Video', posted on [arXiv] by Zihao Yang.
#Review#Spatial Supersensing#Video Understanding#Multimodal LLMs#Predictive Sensing#Memory Management#Event Segmentation#VSI-SUPER#Instruction Tuning

[Paper Review] Diffusion Language Models are Super Data Learners

A detailed review of the paper 'Diffusion Language Models are Super Data Learners', posted on [arXiv].
#Review#Diffusion Language Models#Autoregressive Models#Data Efficiency#Scaling Laws#Data-Constrained Learning#Crossover Phenomenon#Pre-training#Masked Diffusion

[Paper Review] iFlyBot-VLA Technical Report

A detailed review of the paper 'iFlyBot-VLA Technical Report', posted on [arXiv] by Jiajia wu.
#Review#Vision-Language-Action Models#Robotics#Imitation Learning#Latent Actions#Diffusion Models#Dual-Arm Manipulation#Pretraining#Flow-Matching

[Paper Review] The Collaboration Gap

A detailed review of the paper 'The Collaboration Gap', posted on [arXiv].
#Review#AI Collaboration#Multi-Agent Systems#Large Language Models (LLMs)#Maze Solving#Heterogeneous Agents#Collaboration Gap#Relay Inference#Agentic AI

[Paper Review] Step-Audio-EditX Technical Report

A detailed review of the paper 'Step-Audio-EditX Technical Report', posted on [arXiv].
#Review#LLM-based Audio Model#Audio Editing#Text-to-Speech (TTS)#Zero-shot Learning#Large-Margin Data#Reinforcement Learning (RLHF)#Emotion Control#Speaking Style Transfer

[Paper Review] CodeClash: Benchmarking Goal-Oriented Software Engineering

A detailed review of the paper 'CodeClash: Benchmarking Goal-Oriented Software Engineering', posted on [arXiv].
#Review#Software Engineering Benchmarking#Language Models#AI Agents#Goal-Oriented Development#Competitive Programming#Code Evolution#Strategic Reasoning#Autonomous Systems

[Paper Review] ChartM^3: A Multi-Stage Code-Driven Pipeline for Constructing Multi-Dimensional and Multi-Step Visual Reasoning Data in Chart Comprehension

A detailed review of the paper 'ChartM^3: A Multi-Stage Code-Driven Pipeline for Constructing Multi-Dimensional and Multi-Step Visual Reasoning Data in Chart Comprehension', posted on [arXiv] by Hao Wang.
#Review#Chart Comprehension#Visual Reasoning#Data Generation#Code-Driven Pipeline#Multimodal LLMs#Retrieval-Augmented Generation#Reinforcement Learning#Synthetic Data

[Paper Review] $\left|\,\circlearrowright\,\text{BUS}\,\right|$: A Large and Diverse Multimodal Benchmark for evaluating the ability of Vision-Language Models to understand Rebus Puzzles

A detailed review of the paper '$\left|\,\circlearrowright\,\text{BUS}\,\right|$: A Large and Diverse Multimodal Benchmark for evaluating the ability of Vision-Language Models to understand Rebus Puzzles', posted on [arXiv] by Deepiha S.
#Review#Vision-Language Models#Multimodal Benchmark#Rebus Puzzles#In-Context Learning#Reasoning#ControlNet#Prompt Engineering

[Paper Review] World Simulation with Video Foundation Models for Physical AI

A detailed review of the paper 'World Simulation with Video Foundation Models for Physical AI', posted on [arXiv] by Junjie Bai.
#Review#Physical AI#World Simulation#Video Foundation Models#Flow Matching#Reinforcement Learning#Robotics#Autonomous Driving#Synthetic Data Generation

[Paper Review] Vote-in-Context: Turning VLMs into Zero-Shot Rank Fusers

A detailed review of the paper 'Vote-in-Context: Turning VLMs into Zero-Shot Rank Fusers', posted on [arXiv].
#Review#Video Retrieval#Vision-Language Models (VLMs)#Zero-Shot Learning#List-wise Reranking#Rank Fusion#Prompt Engineering#S-Grid#Multimodal Retrieval

[Paper Review] Trove: A Flexible Toolkit for Dense Retrieval

A detailed review of the paper 'Trove: A Flexible Toolkit for Dense Retrieval', posted on [arXiv].
#Review#Dense Retrieval#Retrieval Toolkit#Data Management#Distributed Training#Model Customization#Hard Negative Mining#Hugging Face Integration#Performance Optimization

[Paper Review] Towards Robust Mathematical Reasoning

A detailed review of the paper 'Towards Robust Mathematical Reasoning', posted on [arXiv] by Yuri Chervonyi.
#Review#Mathematical Reasoning#Large Language Models (LLMs)#AI Benchmarks#International Mathematical Olympiad (IMO)#Proof Verification#Automatic Grading#Robustness

[Paper Review] PHUMA: Physically-Grounded Humanoid Locomotion Dataset

A detailed review of the paper 'PHUMA: Physically-Grounded Humanoid Locomotion Dataset', posted on [arXiv].
#Review#Humanoid Locomotion#Dataset#Motion Imitation#Physics-based Control#Motion Retargeting#Data Curation#Reinforcement Learning#Inverse Kinematics

[Paper Review] OpenSIR: Open-Ended Self-Improving Reasoner

A detailed review of the paper 'OpenSIR: Open-Ended Self-Improving Reasoner', posted on [arXiv].
#Review#Open-Ended Learning#Self-Play#Reinforcement Learning#Large Language Models#Mathematical Reasoning#Problem Generation#Curriculum Learning#Reward Shaping

[Paper Review] LongCat-Flash-Omni Technical Report

A detailed review of the paper 'LongCat-Flash-Omni Technical Report', posted on [arXiv] by Bin Xiao.
#Review#Omni-modal AI#Multimodal LLM#Real-time Interaction#Mixture-of-Experts (MoE)#Streaming Inference#Distributed Training#Curriculum Learning#Audio-Visual Perception

[Paper Review] Data-Efficient RLVR via Off-Policy Influence Guidance

A detailed review of the paper 'Data-Efficient RLVR via Off-Policy Influence Guidance', posted on [arXiv] by Jiale Cheng.
#Review#Reinforcement Learning with Verifiable Rewards (RLVR)#Influence Functions#Data Selection#Off-Policy Learning#Curriculum Learning#Large Language Models (LLMs)#Sparse Random Projection#Data Efficiency

[Paper Review] Revisiting Multimodal Positional Encoding in Vision-Language Models

A detailed review of the paper 'Revisiting Multimodal Positional Encoding in Vision-Language Models', posted on [arXiv].
#Review#Multimodal Positional Encoding#Vision-Language Models#Rotary Positional Embedding (RoPE)#Transformer#Multimodal Understanding#Visual Grounding#Frequency Allocation#Position Design

[Paper Review] Monopoly Deal: A Benchmark Environment for Bounded One-Sided Response Games

A detailed review of the paper 'Monopoly Deal: A Benchmark Environment for Bounded One-Sided Response Games', posted on [arXiv] by cavaunpeu.
#Review#Bounded One-Sided Response Games (BORGs)#Monopoly Deal#Benchmark Environment#Counterfactual Regret Minimization (CFR)#Imperfect Information Games#Game Theory#Self-Play#State Abstraction

[Paper Review] MisSynth: Improving MISSCI Logical Fallacies Classification with Synthetic Data

A detailed review of the paper 'MisSynth: Improving MISSCI Logical Fallacies Classification with Synthetic Data', posted on [arXiv] by Nadiya Shvai.
#Review#Health Misinformation#Logical Fallacy Classification#Synthetic Data Generation#Large Language Models (LLMs)#Retrieval-Augmented Generation (RAG)#Parameter-Efficient Fine-tuning (PEFT)#LoRA#MISSCI Benchmark

[Paper Review] Mask-to-Height: A YOLOv11-Based Architecture for Joint Building Instance Segmentation and Height Classification from Satellite Imagery

A detailed review of the paper 'Mask-to-Height: A YOLOv11-Based Architecture for Joint Building Instance Segmentation and Height Classification from Satellite Imagery', posted on [arXiv] by Oğuz Hanoğlu.
#Review#Building Instance Segmentation#Height Classification#YOLOv11#Satellite Imagery#Multitask Learning#Remote Sensing#Urban Planning

[Paper Review] Limits of Generalization in RLVR: Two Case Studies in Mathematical Reasoning

A detailed review of the paper 'Limits of Generalization in RLVR: Two Case Studies in Mathematical Reasoning', posted on [arXiv] by Nidhi Rastogi.
#Review#Reinforcement Learning with Verifiable Rewards (RLVR)#Mathematical Reasoning#Large Language Models (LLMs)#Activity Scheduling#Longest Increasing Subsequence (LIS)#Generalization Limits#Reward Design#Self-consistency

[Paper Review] Higher-order Linear Attention

A detailed review of the paper 'Higher-order Linear Attention', posted on [arXiv].
#Review#Linear Attention#Higher-order Interactions#Causal Streaming#Associative Scans#Prefix Summaries#Transformer Architectures#State Space Models

[Paper Review] Defeating the Training-Inference Mismatch via FP16

A detailed review of the paper 'Defeating the Training-Inference Mismatch via FP16', posted on [arXiv].
#Review#Reinforcement Learning#LLM Fine-tuning#Training-Inference Mismatch#Floating Point Precision#FP16#BF16#RL Stability

[Paper Review] Continuous Autoregressive Language Models

A detailed review of the paper 'Continuous Autoregressive Language Models', posted on [arXiv].
#Review#Large Language Models (LLMs)#Continuous Representation#Autoencoder#Likelihood-Free Modeling#Energy-Based Models#Next-Vector Prediction#Computational Efficiency#Temperature Sampling

[Paper Review] A Survey on Efficient Vision-Language-Action Models

A detailed review of the paper 'A Survey on Efficient Vision-Language-Action Models', posted on [arXiv].
#Review#Embodied AI#Robotic Manipulation#VLA Models#Efficient AI#Model Compression#Efficient Training#Data Collection#Multimodal AI

[Paper Review] PORTool: Tool-Use LLM Training with Rewarded Tree

A detailed review of the paper 'PORTool: Tool-Use LLM Training with Rewarded Tree', posted on [arXiv].
#Review#Tool-Use LLM#Reinforcement Learning (RL)#Policy Optimization#Rewarded Tree#Trajectory Optimization#Agentic System#Dynamic Tool Call

[Paper Review] L^2M^3OF: A Large Language Multimodal Model for Metal-Organic Frameworks

A detailed review of the paper 'L^2M^3OF: A Large Language Multimodal Model for Metal-Organic Frameworks', posted on [arXiv] by Xenophon Evangelopoulos.
#Review#Multimodal LLM#Metal-Organic Frameworks (MOFs)#Materials Discovery#Crystal Representation Learning#Instruction Tuning#Structure-Property Prediction#Knowledge Generation

[Paper Review] FullPart: Generating each 3D Part at Full Resolution

A detailed review of the paper 'FullPart: Generating each 3D Part at Full Resolution', posted on [arXiv] by Chenjian Gao.
#Review#3D Part Generation#Full Resolution#Implicit Representation#Explicit Representation#Voxel Grid#Diffusion Models#PartVerse-XL#Center-Corner Encoding

[Paper Review] Emu3.5: Native Multimodal Models are World Learners

A detailed review of the paper 'Emu3.5: Native Multimodal Models are World Learners', posted on [arXiv].
#Review#Multimodal Model#World Model#Vision-Language#Next-Token Prediction#Reinforcement Learning#Discrete Diffusion Adaptation#Image Generation#Any-to-Image

[Paper Review] The Principles of Diffusion Models

A detailed review of the paper 'The Principles of Diffusion Models', posted on [arXiv] by Stefano Ermon.
#Review#Diffusion Models#Generative AI#Variational Autoencoder#Energy-Based Models#Normalizing Flows#Score-Based SDEs#Flow Matching#Fokker-Planck Equation

[Paper Review] Scaling Latent Reasoning via Looped Language Models

A detailed review of the paper 'Scaling Latent Reasoning via Looped Language Models', posted on [arXiv].
#Review#Looped Language Models#Latent Reasoning#Parameter Efficiency#Adaptive Computation#Pre-training Scaling#Knowledge Manipulation#Early Exit Mechanisms#Transformer Architecture

[Paper Review] Reasoning-Aware GRPO using Process Mining

A detailed review of the paper 'Reasoning-Aware GRPO using Process Mining', posted on [arXiv].
#Review#Reinforcement Learning#Large Language Models#Process Mining#Policy Optimization#Mathematical Reasoning#GRPO#PM4GRPO

[Paper Review] PairUni: Pairwise Training for Unified Multimodal Language Models

A detailed review of the paper 'PairUni: Pairwise Training for Unified Multimodal Language Models', posted on [arXiv].
#Review#Unified Vision-Language Models#Reinforcement Learning#Multimodal Alignment#Pairwise Training#Group Relative Policy Optimization#Data Augmentation#Text-to-Image Generation#Visual Reasoning

[Paper Review] ODesign: A World Model for Biomolecular Interaction Design

A detailed review of the paper 'ODesign: A World Model for Biomolecular Interaction Design', posted on [arXiv] by Qinghan Wang.
#Review#Biomolecular Interaction Design#Generative AI#World Model#Multimodal Molecular Design#All-atom Generation#Diffusion Models#Protein Design#Nucleic Acid Design

[Paper Review] MASPRM: Multi-Agent System Process Reward Model

A detailed review of the paper 'MASPRM: Multi-Agent System Process Reward Model', posted on [arXiv] by Ying Xiong.
#Review#Multi-Agent Systems#Process Reward Model#MCTS#Inference-time Search#LLM Agents#Zero-shot Transfer#Reinforcement Learning#Compute-Aware Reasoning

[Paper Review] Fortytwo: Swarm Inference with Peer-Ranked Consensus

A detailed review of the paper 'Fortytwo: Swarm Inference with Peer-Ranked Consensus', posted on [arXiv].
#Review#Decentralized AI#Swarm Intelligence#AI Inference#Consensus Mechanism#Peer-Ranking#Bradley-Terry Model#Reputation System#Sybil Defense

[Paper Review] Evolving Diagnostic Agents in a Virtual Clinical Environment

A detailed review of the paper 'Evolving Diagnostic Agents in a Virtual Clinical Environment', posted on [arXiv].
#Review#Large Language Models (LLMs)#Diagnostic Agents#Reinforcement Learning (RL)#Virtual Clinical Environment#Medical AI#Multi-turn Diagnosis#EHR (Electronic Health Records)

[Paper Review] VisCoder2: Building Multi-Language Visualization Coding Agents

A detailed review of the paper 'VisCoder2: Building Multi-Language Visualization Coding Agents', posted on [arXiv].
#Review#Multi-Language Visualization#Code Generation#Self-Debugging#Instruction Tuning#Large Language Models#Visualization Benchmark#Coding Agents#Code-Feedback

[Paper Review] Tongyi DeepResearch Technical Report

A detailed review of the paper 'Tongyi DeepResearch Technical Report', posted on [arXiv].
#Review#Agentic LLM#Deep Research#Information Seeking#Reinforcement Learning#Synthetic Data#Context Management#Tool Use#Open-source AI

[Paper Review] Rethinking Visual Intelligence: Insights from Video Pretraining

A detailed review of the paper 'Rethinking Visual Intelligence: Insights from Video Pretraining', posted by Ahmad Rahimi on [arXiv].
#Review#Video Diffusion Models#Visual Intelligence#Pretraining#Foundation Models#Low-resource Learning#Inductive Biases#Visual Reasoning#Image-to-Image Tasks

[Paper Review] Group Relative Attention Guidance for Image Editing

A detailed review of the paper 'Group Relative Attention Guidance for Image Editing', posted on [arXiv].
#Review#Image Editing#Diffusion Transformers#Attention Mechanism#Guidance Mechanism#Controllability#Fine-grained Control#GRAG

[Paper Review] VoMP: Predicting Volumetric Mechanical Property Fields

A detailed review of the paper 'VoMP: Predicting Volumetric Mechanical Property Fields', posted on [arXiv].
#Review#Volumetric Properties#Mechanical Simulation#Material Prediction#3D Representation#Physics-based AI#Variational Autoencoder#Geometry Transformer#Gaussian Splats

[Paper Review] RobotArena∞: Scalable Robot Benchmarking via Real-to-Sim Translation

A detailed review of the paper 'RobotArena∞: Scalable Robot Benchmarking via Real-to-Sim Translation', posted by Kuan-Hsun Tu on [arXiv].
#Review#Robot Benchmarking#Real-to-Sim Translation#Vision-Language Models (VLMs)#Human Preference Learning#Domain Randomization#Robot Manipulation#Simulation Environments#Policy Evaluation

[Paper Review] MARS-M: When Variance Reduction Meets Matrices

A detailed review of the paper 'MARS-M: When Variance Reduction Meets Matrices', posted on [arXiv].
#Review#Variance Reduction#Matrix-based Optimizer#LLM Training#Deep Learning Optimization#Moonlight#MARS-M#Stochastic Gradient Descent

[Paper Review] LongCat-Video Technical Report

A detailed review of the paper 'LongCat-Video Technical Report', posted by Hongyu Li on [arXiv].
#Review#Video Generation#Diffusion Transformer#RLHF#Sparse Attention#Long Video Generation#Coarse-to-Fine Generation#Multi-task Learning#World Models

[Paper Review] Language Server CLI Empowers Language Agents with Process Rewards

A detailed review of the paper 'Language Server CLI Empowers Language Agents with Process Rewards', posted by the Lanser Contributors on [arXiv].
#Review#Language Agents#Language Server Protocol (LSP)#CLI#Process Rewards#Code Refactoring#Static Analysis#Reinforcement Learning#Deterministic Execution

[Paper Review] Knocking-Heads Attention

A detailed review of the paper 'Knocking-Heads Attention', posted by Jianguo Li on [arXiv].
#Review#Multi-Head Attention#Transformer#Large Language Models#Inter-Head Communication#Parameter Sharing#Training Stability#Diagonal Initialization

[Paper Review] FARMER: Flow AutoRegressive Transformer over Pixels

A detailed review of the paper 'FARMER: Flow AutoRegressive Transformer over Pixels', posted by Zhijie Lin on [arXiv].
#Review#Normalizing Flows#Autoregressive Models#Generative Models#Image Synthesis#Tractable Likelihood#Dimension Reduction#Distillation#Classifier-Free Guidance

[Paper Review] DiffusionLane: Diffusion Model for Lane Detection

A detailed review of the paper 'DiffusionLane: Diffusion Model for Lane Detection', posted on [arXiv].
#Review#Lane Detection#Diffusion Model#Denoising Diffusion#Hybrid Decoding#Anchor-based#Domain Adaptation#Computer Vision#Generative Models

[Paper Review] Code Aesthetics with Agentic Reward Feedback

A detailed review of the paper 'Code Aesthetics with Agentic Reward Feedback', posted by Yupan Huang on [arXiv].
#Review#Code Aesthetics#Agentic Reward Feedback#Large Language Models#Reinforcement Learning#Instruction Tuning#Webpage Design#Multimodal Evaluation

[Paper Review] WorldGrow: Generating Infinite 3D World

A detailed review of the paper 'WorldGrow: Generating Infinite 3D World', posted by Jia Lu on [arXiv].
#Review#3D World Generation#Infinite Scene Synthesis#Block-wise Generation#Coarse-to-Fine#3D Inpainting#Structured Latent Representation#Virtual Environments#World Models

[Paper Review] Visual Diffusion Models are Geometric Solvers

A detailed review of the paper 'Visual Diffusion Models are Geometric Solvers', posted by Or Patashnik on [arXiv].
#Review#Diffusion Models#Geometric Problem Solving#Inscribed Square Problem#Steiner Tree Problem#Maximum Area Polygonization#Image Generation#Pixel Space

[Paper Review] Video-As-Prompt: Unified Semantic Control for Video Generation

A detailed review of the paper 'Video-As-Prompt: Unified Semantic Control for Video Generation', posted on [arXiv].
#Review#Video Generation#Semantic Control#Diffusion Transformers#In-Context Learning#Mixture-of-Transformers#Video-As-Prompt#Controllable Generation#Large-scale Dataset

[Paper Review] Sparser Block-Sparse Attention via Token Permutation

A detailed review of the paper 'Sparser Block-Sparse Attention via Token Permutation', posted on [arXiv].
#Review#Large Language Models (LLMs)#Self-Attention#Block-Sparse Attention#Token Permutation#Computational Efficiency#Prefilling#Long Context#Causal Attention

[Paper Review] Soft Instruction De-escalation Defense

A detailed review of the paper 'Soft Instruction De-escalation Defense', posted on [arXiv].
#Review#Prompt Injection#LLM Security#Agentic Systems#Iterative Sanitization#Instruction Control#Adversarial Robustness#Large Language Models

[Paper Review] Model Merging with Functional Dual Anchors

A detailed review of the paper 'Model Merging with Functional Dual Anchors', posted on [arXiv].
#Review#Model Merging#Functional Dual Anchors#Input-Representation Space#Task Vectors#Knowledge Integration#Foundation Models#Gradient Matching#Post-training Strategy

[Paper Review] Map the Flow: Revealing Hidden Pathways of Information in VideoLLMs

A detailed review of the paper 'Map the Flow: Revealing Hidden Pathways of Information in VideoLLMs', posted by Bohyung Han on [arXiv].
#Review#Video Large Language Models#VideoQA#Mechanistic Interpretability#Attention Knockout#Temporal Reasoning#Information Flow#Model Interpretability#Logit Lens

[Paper Review] ALICE-LRI: A General Method for Lossless Range Image Generation for Spinning LiDAR Sensors without Calibration Metadata

A detailed review of the paper 'ALICE-LRI: A General Method for Lossless Range Image Generation for Spinning LiDAR Sensors without Calibration Metadata', posted by José C. Cabaleiro on [arXiv].
#Review#LiDAR#Range Image#Lossless Projection#Sensor Calibration#Intrinsic Parameters#Point Cloud Reconstruction#Hough Transform#Weighted Least Squares

[Paper Review] A Definition of AGI

A detailed review of the paper 'A Definition of AGI', posted by Yarin Gal on [arXiv].
#Review#AGI Definition#Cognitive Assessment#Cattell-Horn-Carroll Theory#AI Evaluation#Multimodal AI#Cognitive Domains#Psychometrics

[Paper Review] Thought Communication in Multiagent Collaboration

A detailed review of the paper 'Thought Communication in Multiagent Collaboration', posted by Mingze Gao on [arXiv].
#Review#Multiagent Systems#LLM Communication#Latent Variable Models#Identifiability Theory#Thought Communication#Sparse Autoencoder#Prefix Tuning

[Paper Review] The Massive Legal Embedding Benchmark (MLEB)

A detailed review of the paper 'The Massive Legal Embedding Benchmark (MLEB)', posted on [arXiv].
#Review#Legal Information Retrieval#Embedding Models#Benchmark Dataset#Natural Language Processing#Retrieval-Augmented Generation#Jurisdictional Diversity#Legal Tech

[Paper Review] Emergence of Linear Truth Encodings in Language Models

A detailed review of the paper 'Emergence of Linear Truth Encodings in Language Models', posted by Alberto Bietti on [arXiv].
#Review#Language Models#Truth Encoding#Linear Subspaces#Mechanistic Interpretability#Transformer Models#Learning Dynamics#Truth Co-occurrence Hypothesis#Hallucinations

[Paper Review] ComProScanner: A multi-agent based framework for composition-property structured data extraction from scientific literature

A detailed review of the paper 'ComProScanner: A multi-agent based framework for composition-property structured data extraction from scientific literature', posted on [arXiv].
#Review#Multi-agent Systems#Large Language Models (LLMs)#Information Extraction#Scientific Literature#Materials Science#Data Curation#Piezoelectric Materials#RAG (Retrieval-Augmented Generation)

[Paper Review] ARGenSeg: Image Segmentation with Autoregressive Image Generation Model

A detailed review of the paper 'ARGenSeg: Image Segmentation with Autoregressive Image Generation Model', posted on [arXiv].
#Review#Image Segmentation#Autoregressive Generation#Multimodal Large Language Models (MLLMs)#Visual Understanding#VQ-VAE#Multi-scale Prediction#Referring Expression Segmentation#Image Generation

[Paper Review] olmOCR 2: Unit Test Rewards for Document OCR

A detailed review of the paper 'olmOCR 2: Unit Test Rewards for Document OCR', posted on [arXiv].
#Review#Document OCR#Vision Language Model#Reinforcement Learning#Unit Tests#Synthetic Data Generation#RLVR#Document Parsing#State-of-the-Art OCR

[Paper Review] OmniNWM: Omniscient Driving Navigation World Models

A detailed review of the paper 'OmniNWM: Omniscient Driving Navigation World Models', posted by Zhujin Liang on [arXiv].
#Review#Autonomous Driving#World Models#Multi-modal Generation#3D Occupancy#Plücker Ray-maps#Action Control#Dense Rewards#Long-term Forecasting

[Paper Review] Machine Text Detectors are Membership Inference Attacks

A detailed review of the paper 'Machine Text Detectors are Membership Inference Attacks', posted by Naoaki Okazaki on [arXiv].
#Review#Membership Inference Attacks#Machine-Generated Text Detection#Transferability#Likelihood Ratio Test#Large Language Models#Zero-Shot Detection#Model Security#AI Safety

[Paper Review] Language Models are Injective and Hence Invertible

A detailed review of the paper 'Language Models are Injective and Hence Invertible', posted on [arXiv].
#Review#Language Models#Injectivity#Invertibility#Transformer#Representation Learning#Exact Recovery#SIPIT Algorithm#Real Analysis

[Paper Review] FinSight: Towards Real-World Financial Deep Research

A detailed review of the paper 'FinSight: Towards Real-World Financial Deep Research', posted by Yutao Zhu on [arXiv].
#Review#Financial Research#Multi-Agent System#Code Generation#Multimodal Reports#Iterative Visualization#Variable Memory#Deep Learning

[Paper Review] Directional Reasoning Injection for Fine-Tuning MLLMs

A detailed review of the paper 'Directional Reasoning Injection for Fine-Tuning MLLMs', posted by Jialian Wu on [arXiv].
#Review#Multimodal LLMs#Reasoning Transfer#Gradient-based Fine-tuning#Model Merging#Parameter-Efficient Learning#Supervised Fine-tuning#Directional Prior

[Paper Review] Attention Sinks in Diffusion Language Models

A detailed review of the paper 'Attention Sinks in Diffusion Language Models', posted by Simone Scardapane on [arXiv].
#Review#Diffusion Language Models#Attention Sinks#Transformer Architecture#Masked Language Modeling#Bidirectional Attention#Generative Models#Robustness#Dynamic Attention

[Paper Review] World-in-World: World Models in a Closed-Loop World

A detailed review of the paper 'World-in-World: World Models in a Closed-Loop World', posted by Arda Uzunoglu on [arXiv].
#Review#World Models#Embodied AI#Closed-Loop Evaluation#Online Planning#Data Scaling#Controllability#Robotic Manipulation

[Paper Review] Video Reasoning without Training

A detailed review of the paper 'Video Reasoning without Training', posted on [arXiv].
#Review#Video Reasoning#Large Multimodal Models (LMMs)#Inference-Time Optimization#Entropy-Based Objective#Training-Free#KV-Cache Steering#Micro-Exploration#Macro-Exploitation

[Paper Review] Unleashing Scientific Reasoning for Bio-experimental Protocol Generation via Structured Component-based Reward Mechanism

A detailed review of the paper 'Unleashing Scientific Reasoning for Bio-experimental Protocol Generation via Structured Component-based Reward Mechanism', posted by Shuang Gu on [arXiv].
#Review#Scientific Reasoning#Bio-experimental Protocol Generation#LLM#Structured Reward#SciRecipe Dataset#Sketch-and-Fill#Reinforcement Learning#Thoth

[Paper Review] Extracting alignment data in open models

A detailed review of the paper 'Extracting alignment data in open models', posted on [arXiv].
#Review#Alignment Data Extraction#Large Language Models#Memorization#Neural Embeddings#Semantic Similarity#Chat Templates#Model Distillation#Reinforcement Learning#Supervised Finetuning

[Paper Review] DSI-Bench: A Benchmark for Dynamic Spatial Intelligence

A detailed review of the paper 'DSI-Bench: A Benchmark for Dynamic Spatial Intelligence', posted on [arXiv].
#Review#Dynamic Spatial Reasoning#Vision-Language Models (VLMs)#Benchmark#Video Understanding#Motion Perception#3D Spatial Intelligence#Hallucinations#Bias

[Paper Review] Chem-R: Learning to Reason as a Chemist

A detailed review of the paper 'Chem-R: Learning to Reason as a Chemist', posted on [arXiv].
#Review#Chemical Reasoning#Large Language Models#Chem-R#Structured Reasoning#Multi-task Optimization#Chain-of-Thought#Chemical Discovery

[Paper Review] RL makes MLLMs see better than SFT

A detailed review of the paper 'RL makes MLLMs see better than SFT', posted on [arXiv].
#Review#Multimodal Language Models#Reinforcement Learning#Supervised Finetuning#Vision Encoder#Visual Representations#Direct Preference Optimization#Preference Alignment#PIVOT

[Paper Review] On Non-interactive Evaluation of Animal Communication Translators

A detailed review of the paper 'On Non-interactive Evaluation of Animal Communication Translators', posted by Adam Tauman Kalai on [arXiv].
#Review#Machine Translation Quality Evaluation#Reference-Free Evaluation#Animal Communication#Language Models#Shuffle Test#Conlangs#Non-interactive Evaluation

[Paper Review] FineVision: Open Data Is All You Need

A detailed review of the paper 'FineVision: Open Data Is All You Need', posted on [arXiv].
#Review#Multimodal Datasets#VLM#Data Curation#Data Hygiene#De-duplication#Human-in-the-loop#GUI Automation#Test-set Decontamination

[Paper Review] Executable Knowledge Graphs for Replicating AI Research

A detailed review of the paper 'Executable Knowledge Graphs for Replicating AI Research', posted on [arXiv].
#Review#AI Research Replication#Large Language Models (LLMs)#Knowledge Graphs (KGs)#Executable Code Generation#Retrieval-Augmented Generation (RAG)#PaperBench#Automated AI Research

[Paper Review] Deep Self-Evolving Reasoning

A detailed review of the paper 'Deep Self-Evolving Reasoning', posted on [arXiv].
#Review#Deep Self-Evolving Reasoning#LLMs#Iterative Reasoning#Markov Chain#Self-Verification#Self-Refinement#Mathematical Reasoning#AIME Benchmark

[Paper Review] Chronos-2: From Univariate to Universal Forecasting

A detailed review of the paper 'Chronos-2: From Univariate to Universal Forecasting', posted on [arXiv].
#Review#Time Series Forecasting#Foundation Models#Pretrained Models#Transformer#In-Context Learning#Multivariate Forecasting#Covariates#Group Attention

[Paper Review] Balanced Multi-Task Attention for Satellite Image Classification: A Systematic Approach to Achieving 97.23% Accuracy on EuroSAT Without Pre-Training

A detailed review of the paper 'Balanced Multi-Task Attention for Satellite Image Classification: A Systematic Approach to Achieving 97.23% Accuracy on EuroSAT Without Pre-Training', posted by Aditya Vir on [arXiv].
#Review#Satellite Image Classification#Multi-Task Attention#From-Scratch Training#EuroSAT Dataset#Squeeze-Excitation Networks#Coordinate Attention#CNN#Deep Learning Architecture

[Paper Review] AsyncVoice Agent: Real-Time Explanation for LLM Planning and Reasoning

A detailed review of the paper 'AsyncVoice Agent: Real-Time Explanation for LLM Planning and Reasoning', posted by Nikos Vlassis on [arXiv].
#Review#Real-Time Interaction#Asynchronous Agents#LLM Explanation#Human-AI Collaboration#Voice Interface#Planning and Reasoning#Context Management#Interruption Handling

[Paper Review] Annotation-Efficient Universal Honesty Alignment

A detailed review of the paper 'Annotation-Efficient Universal Honesty Alignment', posted by Jingtong Wu on [arXiv].
#Review#LLM Honesty Alignment#Confidence Calibration#Annotation Efficiency#Self-Consistency#Elicitation-Then-Calibration (EliCal)#HonestyBench#LoRA#Trustworthy AI

[Paper Review] VISTA: A Test-Time Self-Improving Video Generation Agent

A detailed review of the paper 'VISTA: A Test-Time Self-Improving Video Generation Agent', posted by Tomas Pfister on [arXiv].
#Review#Text-to-Video Generation#Prompt Optimization#Multi-Agent System#Test-Time Improvement#MLLM-as-a-Judge#Video Evaluation#Audio-Video Synthesis

[Paper Review] Robust Layerwise Scaling Rules by Proper Weight Decay Tuning

A detailed review of the paper 'Robust Layerwise Scaling Rules by Proper Weight Decay Tuning', posted on [arXiv].
#Review#Weight Decay Scaling#Maximal-Update Parameterization (µP)#AdamW#Transformer#Hyperparameter Transfer#Scaling Laws#Singular Value Spectrum#Steady State Training

[Paper Review] Rewiring Experts on the Fly: Continuous Rerouting for Better Online Adaptation in Mixture-of-Expert models

A detailed review of the paper 'Rewiring Experts on the Fly: Continuous Rerouting for Better Online Adaptation in Mixture-of-Expert models', posted by Shiwei Liu on [arXiv].
#Review#Mixture-of-Experts (MoE)#Online Adaptation#Test-Time Adaptation (TTA)#Expert Routing#Large Language Models (LLMs)#Self-Supervision#Computational Efficiency#Context Shift Robustness

[Paper Review] Paper2Web: Let's Make Your Paper Alive!

A detailed review of the paper 'Paper2Web: Let's Make Your Paper Alive!', posted by Yao Wan on [arXiv].
#Review#Academic Webpage Generation#Multi-Agent Systems#Large Language Models#Model Context Protocol#Interactive Content#Multimedia Dissemination#Evaluation Benchmark#Human-Computer Interaction

[Paper Review] Latent Diffusion Model without Variational Autoencoder

A detailed review of the paper 'Latent Diffusion Model without Variational Autoencoder', posted on [arXiv].
#Review#Latent Diffusion Model#Variational Autoencoder#Self-supervised Learning#DINO Features#Generative Models#Image Generation#Training Efficiency#Unified Representation

[Paper Review] Language Models Model Language

A detailed review of the paper 'Language Models Model Language', posted on [arXiv].
#Review#Large Language Models#Linguistics#Witold Mańczak#Frequency Hypothesis#Empirical Validation#Usage-Based Linguistics#Semantic Embeddings

[Paper Review] BLIP3o-NEXT: Next Frontier of Native Image Generation

A detailed review of the paper 'BLIP3o-NEXT: Next Frontier of Native Image Generation', posted on [arXiv].
#Review#Image Generation#Image Editing#Autoregressive Model#Diffusion Model#Reinforcement Learning#Multimodal AI#Foundation Model#Open-source

[Paper Review] VLA-0: Building State-of-the-Art VLAs with Zero Modification

A detailed review of the paper 'VLA-0: Building State-of-the-Art VLAs with Zero Modification', posted on [arXiv].
#Review#Vision-Language-Action Models#VLA-0#Zero Modification#Text-based Action Prediction#Robot Manipulation#Large Language Models#Fine-tuning#State-of-the-Art

[Paper Review] RealDPO: Real or Not Real, that is the Preference

A detailed review of the paper 'RealDPO: Real or Not Real, that is the Preference', posted by Chenyang Si on [arXiv].
#Review#Video Generation#Diffusion Models#Direct Preference Optimization#Preference Learning#Real Data#Human Motion Synthesis#RealDPO#RealAction-5K

[Paper Review] Qwen3Guard Technical Report

A detailed review of the paper 'Qwen3Guard Technical Report', posted on [arXiv].
#Review#LLM Safety#Guardrail Models#Multilingual AI#Real-time Moderation#Tri-class Classification#Instruction Tuning#Streaming Inference

[Paper Review] On Pretraining for Project-Level Code Completion

A detailed review of the paper 'On Pretraining for Project-Level Code Completion', posted on [arXiv].
#Review#Code LLMs#Project-level Context#Code Completion#Context Window Extension#RoPE Scaling#Repository Pretraining#Long Code Arena

[Paper Review] MoM: Mixtures of Scenario-Aware Document Memories for Retrieval-Augmented Generation Systems

A detailed review of the paper 'MoM: Mixtures of Scenario-Aware Document Memories for Retrieval-Augmented Generation Systems', posted by Feiyu Xiong on [arXiv].
#Review#Retrieval-Augmented Generation (RAG)#Document Memory#Text Chunking#Small Language Models (SLMs)#Large Language Models (LLMs)#Scenario-Aware Processing#Multi-Layer Retrieval#Cognitive Simulation

[Paper Review] Learning an Image Editing Model without Image Editing Pairs

A detailed review of the paper 'Learning an Image Editing Model without Image Editing Pairs', posted on [arXiv].
#Review#Image Editing#Diffusion Models#Vision-Language Models (VLMs)#No-Pair Training#Few-step Generation#Distribution Matching#Gradient-based Optimization

[Paper Review] Large Language Models Do NOT Really Know What They Don't Know

A detailed review of the paper 'Large Language Models Do NOT Really Know What They Don't Know', posted on [arXiv].
#Review#LLMs#Hallucination Detection#Mechanistic Interpretability#Internal States#Knowledge Recall#Refusal Tuning#Factual Associations#Associated Hallucinations

[Paper Review] LLM-guided Hierarchical Retrieval

A detailed review of the paper 'LLM-guided Hierarchical Retrieval', posted on [arXiv].
#Review#Information Retrieval#Large Language Models#Hierarchical Retrieval#Semantic Tree#Tree Traversal#Zero-shot Performance#Reasoning-based Retrieval#Computational Efficiency

[Paper Review] BitNet Distillation

A detailed review of the paper 'BitNet Distillation', posted on [arXiv].
#Review#Low-bit Quantization#LLM Compression#Knowledge Distillation#Ternary Weights#Inference Optimization#Memory Efficiency#SubLN#Continual Pre-training

[Paper Review] Agentic Entropy-Balanced Policy Optimization

A detailed review of the paper 'Agentic Entropy-Balanced Policy Optimization', posted on [arXiv].
#Review#Agentic Reinforcement Learning#Web Agents#Tool Learning#Entropy Balancing#Policy Optimization#Rollout Strategy#Large Language Models

[Paper Review] Revisiting Model Interpolation for Efficient Reasoning

A detailed review of the paper 'Revisiting Model Interpolation for Efficient Reasoning', posted on [arXiv].
#Review#Model Interpolation#Efficient Reasoning#Large Language Models#Chain-of-Thought#Model Merging#Performance Dynamics#Ablation Study

[Paper Review] Reasoning in Space via Grounding in the World

A detailed review of the paper 'Reasoning in Space via Grounding in the World', posted by Li Zhang on [arXiv].
#Review#3D Visual Grounding#Spatial Reasoning#Large Language Models (LLMs)#Chain-of-Thought (CoT)#Hybrid Representation#Multi-modal LLMs#Point Clouds

[Paper Review] NOSA: Native and Offloadable Sparse Attention

A detailed review of the paper 'NOSA: Native and Offloadable Sparse Attention', posted by Zhiyuan Liu on [arXiv].
#Review#Sparse Attention#KV Cache Offloading#LLMs#Decoding Throughput#Locality Constraint#Memory Optimization#Trainable Sparse Attention

[Paper Review] Hierarchical Frequency Tagging Probe (HFTP): A Unified Approach to Investigate Syntactic Structure Representations in Large Language Models and the Human Brain

A detailed review of the paper 'Hierarchical Frequency Tagging Probe (HFTP): A Unified Approach to Investigate Syntactic Structure Representations in Large Language Models and the Human Brain', posted by Lingxi Lu on [arXiv].
#Review#Large Language Models#Syntactic Structure#Human Brain#Frequency Tagging#Neuroscience#Model Interpretability#Representational Similarity Analysis#Intracranial EEG

[Paper Review] FlashWorld: High-quality 3D Scene Generation within Seconds

A detailed review of the paper 'FlashWorld: High-quality 3D Scene Generation within Seconds', posted by Chunchao Guo on [arXiv].
#Review#3D Scene Generation#Diffusion Models#Multi-View Synthesis#3D Gaussian Splatting#Knowledge Distillation#Real-time Generation#High-Quality Rendering#Cross-modal Training

[Paper Review] FG-CLIP 2: A Bilingual Fine-grained Vision-Language Alignment Model

A detailed review of the paper 'FG-CLIP 2: A Bilingual Fine-grained Vision-Language Alignment Model', posted by Dawei Liang on [arXiv].
#Review#Vision-Language Alignment#Fine-grained Understanding#Bilingual Model#Contrastive Learning#Multimodal Retrieval#Open-Vocabulary Detection#Region-Text Matching

[Paper Review] Direct Multi-Token Decoding

A detailed review of the paper 'Direct Multi-Token Decoding', posted by Xifeng Yan on [arXiv].
#Review#LLM Inference#Multi-token Decoding#Transformer Architecture#Layer Specialization#Cyclical Refilling#Inference Speedup#Model Scaling

[Paper Review] What If: Understanding Motion Through Sparse Interactions

A detailed review of the paper 'What If: Understanding Motion Through Sparse Interactions', posted on [arXiv].
#Review#Motion Understanding#Sparse Interactions#Multimodal Prediction#Flow Poke Transformer#Physical Scene Dynamics#Uncertainty Quantification#Generative Models#Computer Vision

[Paper Review] ViCO: A Training Strategy towards Semantic Aware Dynamic High-Resolution

A detailed review of the paper 'ViCO: A Training Strategy towards Semantic Aware Dynamic High-Resolution', posted on [arXiv].
#Review#Multimodal Large Language Models (MLLMs)#Dynamic Resolution#Token Compression#Semantic Awareness#Visual Consistency Learning (ViCO)#Visual Resolution Router (ViR)#Inference Optimization

[Paper Review] Tensor Logic: The Language of AI

A detailed review of the paper 'Tensor Logic: The Language of AI', posted by Pedro Domingos on [arXiv].
#Review#Tensor Logic#Neurosymbolic AI#Logic Programming#Tensor Algebra#Deep Learning#Automated Reasoning#Embedding Space

[Paper Review] Robot Learning: A Tutorial

A detailed review of the paper 'Robot Learning: A Tutorial', posted on [arXiv].
#Review#Robot Learning#Reinforcement Learning#Imitation Learning#Behavioral Cloning#Vision-Language-Action Models#Diffusion Models#Transformers#LeRobot

[Paper Review] HoneyBee: Data Recipes for Vision-Language Reasoners

A detailed review of the paper 'HoneyBee: Data Recipes for Vision-Language Reasoners', posted on [arXiv].
#Review#Vision-Language Models#Data Curation#Chain-of-Thought#VL Reasoning#Dataset Scaling#Supervised Finetuning#HONEYBEE#Test-Time Scaling

[Paper Review] ExpVid: A Benchmark for Experiment Video Understanding & Reasoning

A detailed review of the paper 'ExpVid: A Benchmark for Experiment Video Understanding & Reasoning', posted on [arXiv].
#Review#Experiment Video Understanding#Multimodal Large Language Models (MLLMs)#Scientific Reasoning#Benchmark#Wet-Lab Experiments#Procedural Understanding#Fine-grained Perception#Video QA

[Paper Review] Dr.LLM: Dynamic Layer Routing in LLMs

A detailed review of the paper 'Dr.LLM: Dynamic Layer Routing in LLMs', posted on [arXiv].
#Review#Dynamic Routing#LLMs#Adaptive Depth#Computational Efficiency#Monte Carlo Tree Search (MCTS)#Retrofittable Framework#Supervised Learning#Accuracy Improvement

[Paper Review] Detect Anything via Next Point Prediction

A detailed review of the paper 'Detect Anything via Next Point Prediction', posted on [arXiv].
#Review#Multimodal Large Language Models#Object Detection#Coordinate Prediction#Reinforcement Learning#Supervised Fine-tuning#Visual Perception#Zero-shot Learning#Spatial Reasoning

[Paper Review] A Survey of Vibe Coding with Large Language Models

A detailed review of the paper 'A Survey of Vibe Coding with Large Language Models', posted on [arXiv].
#Review#Vibe Coding#Large Language Models#Coding Agents#Human-AI Collaboration#Software Engineering#Development Models#Context Engineering

[Paper Review] Which Heads Matter for Reasoning? RL-Guided KV Cache Compression

A detailed review of the paper 'Which Heads Matter for Reasoning? RL-Guided KV Cache Compression', posted by Huan Wang on [arXiv].
#Review#KV Cache Compression#Large Language Models (LLMs)#Reinforcement Learning (RL)#Reasoning Models#Attention Heads#Chain-of-Thought (CoT)#Memory Efficiency

[Paper Review] Understanding DeepResearch via Reports

A detailed review of the paper 'Understanding DeepResearch via Reports', posted by Chengen Huang on [arXiv].
#Review#DeepResearch Agents#LLM-as-a-Judge#Report Evaluation#Agentic AI#Factuality#Redundancy#Research Automation#Benchmark

[Paper Review] PhysToolBench: Benchmarking Physical Tool Understanding for MLLMs

A detailed review of the paper 'PhysToolBench: Benchmarking Physical Tool Understanding for MLLMs', posted by Xu Zheng on [arXiv].
#Review#Multimodal Large Language Models (MLLMs)#Physical Tool Understanding#Benchmarking#Embodied AI#Visual Question Answering (VQA)#Tool Affordances#Reasoning

[Paper Review] Parallel Test-Time Scaling for Latent Reasoning Models

A detailed review of the paper 'Parallel Test-Time Scaling for Latent Reasoning Models', posted on [arXiv].
#Review#Latent Reasoning#Test-Time Scaling#Parallel Inference#Stochastic Sampling#Monte Carlo Dropout#Additive Gaussian Noise#Latent Reward Model#Trajectory Aggregation

[Paper Review] Mitigating Overthinking through Reasoning Shaping

A detailed review of the paper 'Mitigating Overthinking through Reasoning Shaping', posted by Wen Luo on [arXiv].
#Review#Large Reasoning Models (LRMs)#RLVR#Overthinking Mitigation#Reasoning Shaping#Segment-level Penalization#Computational Efficiency#Training Stability#Length-aware Weighting

[Paper Review] KORMo: Korean Open Reasoning Model for Everyone

A detailed review of the paper 'KORMo: Korean Open Reasoning Model for Everyone', posted on [arXiv].
#Review#Large Language Model#Korean#Bilingual#Synthetic Data#Fully Open Model#Tokenizer#Reasoning#Pretraining#Instruction Tuning

[Paper Review] Instant4D: 4D Gaussian Splatting in Minutes

A detailed review of the paper 'Instant4D: 4D Gaussian Splatting in Minutes', posted by Li Lu on [arXiv].
#Review#4D Gaussian Splatting#Dynamic View Synthesis#Monocular Reconstruction#Visual SLAM#Grid Pruning#Real-time Rendering#GPU Memory Optimization

[Paper Review] Hybrid-grained Feature Aggregation with Coarse-to-fine Language Guidance for Self-supervised Monocular Depth Estimation

A detailed review of the paper 'Hybrid-grained Feature Aggregation with Coarse-to-fine Language Guidance for Self-supervised Monocular Depth Estimation', posted by Zekun Qi on [arXiv].
#Review#Self-supervised Monocular Depth Estimation#Foundation Models#CLIP#DINO#Language Guidance#Coarse-to-fine Learning#Feature Aggregation#3D Perception

[Paper Review] AutoPR: Let's Automate Your Academic Promotion!

A detailed review of the paper 'AutoPR: Let's Automate Your Academic Promotion!', posted by Yixin Yuan on [arXiv].
#Review#Academic Promotion#Large Language Models#Multi-Agent Systems#Scholarly Communication#Multimodal Processing#Benchmark#Content Generation#Social Media Marketing

[Paper Review] A Goal Without a Plan Is Just a Wish: Efficient and Effective Global Planner Training for Long-Horizon Agent Tasks

A detailed review of the paper 'A Goal Without a Plan Is Just a Wish: Efficient and Effective Global Planner Training for Long-Horizon Agent Tasks', posted by Fanchao Qi on [arXiv].
#Review#Long-Horizon Tasks#LLM Agents#Global Planning#Reinforcement Learning#Supervised Fine-tuning#Homologous Consensus Filtering#Executor Capability Gain Reward#Plan-and-Execute

[Paper Review] Training-Free Group Relative Policy Optimization

A detailed review of the paper 'Training-Free Group Relative Policy Optimization', posted on [arXiv].
#Review#LLM Agents#Reinforcement Learning#Parameter-Free Optimization#Experiential Knowledge#Token Prior#Group Relative Policy Optimization#In-Context Learning#Cost-Effective AI

[Paper Review] Towards Scalable and Consistent 3D Editing

A detailed review of the paper 'Towards Scalable and Consistent 3D Editing', posted by Pan Zhou on [arXiv].
#Review#3D Editing#Generative Models#Transformer Architecture#Dataset Generation#Multimodal Learning#Conditional Generation#Image-to-3D

[Paper Review] SViM3D: Stable Video Material Diffusion for Single Image 3D Generation

A detailed review of the paper 'SViM3D: Stable Video Material Diffusion for Single Image 3D Generation', posted on [arXiv].
#Review#Single Image 3D Reconstruction#Material Prediction#Video Diffusion Models#Physically Based Rendering (PBR)#Inverse Rendering#Novel View Synthesis#Camera Control#Latent Diffusion

[Paper Review] MemMamba: Rethinking Memory Patterns in State Space Model

A detailed review of the paper 'MemMamba: Rethinking Memory Patterns in State Space Model', posted by Xiao Sun on [arXiv].
#Review#State Space Models#Mamba#Long-sequence modeling#Memory decay#State summarization#Cross-layer attention#Perplexity#Linear complexity

[Paper Review] MM-HELIX: Boosting Multimodal Long-Chain Reflective Reasoning with Holistic Platform and Adaptive Hybrid Policy Optimization

A detailed review of the paper 'MM-HELIX: Boosting Multimodal Long-Chain Reflective Reasoning with Holistic Platform and Adaptive Hybrid Policy Optimization', posted by vanilla1116 on [arXiv].
#Review#Multimodal LLMs#Reflective Reasoning#Long-Chain Reasoning#Benchmark#Policy Optimization#Data Generation#Reinforcement Learning#Backtracking

[Paper Review] GCPO: When Contrast Fails, Go Gold

A detailed review of the paper 'GCPO: When Contrast Fails, Go Gold', posted on [arXiv].
#Review#Reinforcement Learning#LLMs Reasoning#Policy Optimization#Contrastive Learning#Chain of Thought#Reference Answers#Math Reasoning#Gold-Standard Answer

[Paper Review] Fidelity-Aware Data Composition for Robust Robot Generalization

A detailed review of the paper 'Fidelity-Aware Data Composition for Robust Robot Generalization', posted by Liliang Chen on [arXiv].
#Review#Robot Generalization#Data Augmentation#Out-of-Distribution (OOD)#Shortcut Learning#Information Fidelity#Data Composition#Diffusion Models#Multi-View Video Synthesis

[Paper Review] Entropy Regularizing Activation: Boosting Continuous Control, Large Language Models, and Image Classification with Activation as Entropy Constraints

A detailed review of the paper 'Entropy Regularizing Activation: Boosting Continuous Control, Large Language Models, and Image Classification with Activation as Entropy Constraints', posted by Huazhe Xu on [arXiv].
#Review#Entropy Regularization#Activation Functions#Continuous Control#Large Language Models#Image Classification#Reinforcement Learning#Policy Stochasticity#Entropy Constraints

[Paper Review] Beyond Outliers: A Study of Optimizers Under Quantization

A detailed review of the paper 'Beyond Outliers: A Study of Optimizers Under Quantization' posted on [arXiv].
#Review#Quantization#Optimizers#LLM#Post-Training Quantization (PTQ)#Quantization-Aware Training (QAT)#Error Propagation#Scaling Laws#Shampoo

[Paper Review] Agent Learning via Early Experience

A detailed review of the paper 'Agent Learning via Early Experience' posted on [arXiv].
#Review#Language Agents#Early Experience#Reward-Free Learning#World Modeling#Self-Reflection#Imitation Learning#Reinforcement Learning#Out-of-Domain Generalization

[Paper Review] The Markovian Thinker

A detailed review of the paper 'The Markovian Thinker' posted on [arXiv].
#Review#Reinforcement Learning#Large Language Models#Chain-of-Thought#Markovian Thinking#Context Management#Computational Efficiency#Long-Context LLMs#Transformer Optimization

[Paper Review] TTRV: Test-Time Reinforcement Learning for Vision Language Models

A detailed review of the paper 'TTRV: Test-Time Reinforcement Learning for Vision Language Models' posted on [arXiv] by Serena Yeung-Levy.
#Review#Vision-Language Models (VLMs)#Reinforcement Learning (RL)#Test-Time Adaptation#Unsupervised Learning#Image Recognition#Visual Question Answering (VQA)#Group Relative Policy Optimization (GRPO)#Entropy Regularization

[Paper Review] Patch-as-Decodable-Token: Towards Unified Multi-Modal Vision Tasks in MLLMs

A detailed review of the paper 'Patch-as-Decodable-Token: Towards Unified Multi-Modal Vision Tasks in MLLMs' posted on [arXiv] by Jingyi Liao.
#Review#Multimodal Large Language Models (MLLMs)#Visual Reference Tokens (VRTs)#Dense Prediction#Referring Expression Comprehension (REC)#Open-Vocabulary Detection (OVD)#Image Captioning#Unified Architecture#Autoregressive Generation

[Paper Review] Online Generic Event Boundary Detection

A detailed review of the paper 'Online Generic Event Boundary Detection' posted on [arXiv] by Jonghyun Choi.
#Review#Online Video Analysis#Event Boundary Detection#Event Segmentation Theory#Real-time AI#Anomaly Detection#Transformer Architecture

[Paper Review] NorMuon: Making Muon more efficient and scalable

A detailed review of the paper 'NorMuon: Making Muon more efficient and scalable' posted on [arXiv] by Tuo Zhao.
#Review#LLM Training#Optimizer#Muon#Orthogonalization#Adaptive Learning Rates#Distributed Training#FSDP2#NorMuon

[Paper Review] Native Hybrid Attention for Efficient Sequence Modeling

A detailed review of the paper 'Native Hybrid Attention for Efficient Sequence Modeling' posted on [arXiv] by Yu Cheng.
#Review#Sequence Modeling#Hybrid Attention#Transformer Architecture#Linear Attention#Sliding Window Attention#Long Context#Large Language Models (LLMs)#Efficiency

[Paper Review] Multi-Agent Tool-Integrated Policy Optimization

A detailed review of the paper 'Multi-Agent Tool-Integrated Policy Optimization' posted on [arXiv] by Lidong Bing.
#Review#Multi-Agent RL#Tool-Integrated Planning#Large Language Models (LLMs)#Policy Optimization#Credit Assignment#Reinforcement Learning#MATPO

[Paper Review] MATRIX: Mask Track Alignment for Interaction-aware Video Generation

A detailed review of the paper 'MATRIX: Mask Track Alignment for Interaction-aware Video Generation' posted on [arXiv] by Hyunwook Choi.
#Review#Video Generation#Diffusion Transformers#Human-Object Interaction#Attention Alignment#Mask Tracking#Semantic Grounding#Semantic Propagation#Text-to-Video

[Paper Review] Heptapod: Language Modeling on Visual Signals

A detailed review of the paper 'Heptapod: Language Modeling on Visual Signals' posted on [arXiv].
#Review#Autoregressive Models#Image Generation#Language Modeling#Causal Transformer#2D Distribution Prediction#Visual Tokenization#Self-Supervised Learning#Generative Models

[Paper Review] G^2RPO: Granular GRPO for Precise Reward in Flow Models

A detailed review of the paper 'G^2RPO: Granular GRPO for Precise Reward in Flow Models' posted on [arXiv].
#Review#Reinforcement Learning#Flow Models#Generative Models#Human Preference Alignment#Stochastic Differential Equations (SDE)#Reward Signal#Multi-Granularity

[Paper Review] Bridging Text and Video Generation: A Survey

A detailed review of the paper 'Bridging Text and Video Generation: A Survey' posted on [arXiv] by G. Maragatham.
#Review#Text-to-Video Generation#Generative Models#Diffusion Models#GANs#VAEs#Video Synthesis#Survey#Evaluation Metrics

[Paper Review] Training Dynamics Impact Post-Training Quantization Robustness

A detailed review of the paper 'Training Dynamics Impact Post-Training Quantization Robustness' posted on [arXiv] by Jonas Geiping.
#Review#Post-Training Quantization#Quantization Robustness#Training Dynamics#Learning Rate Schedules#Weight Averaging#Large Language Models#LLMs#Hyperparameter Tuning

[Paper Review] ShapeGen4D: Towards High Quality 4D Shape Generation from Videos

A detailed review of the paper 'ShapeGen4D: Towards High Quality 4D Shape Generation from Videos' posted on [arXiv] by Sergey Tulyakov.
#Review#4D Shape Generation#Video-conditioned#Dynamic 3D Meshes#Latent Diffusion Model#Spatiotemporal Attention#Temporal Consistency#Pre-trained 3D Models#VAE

[Paper Review] Revisiting Modeling and Evaluation Approaches in Speech Emotion Recognition: Considering Subjectivity of Annotators and Ambiguity of Emotions

A detailed review of the paper 'Revisiting Modeling and Evaluation Approaches in Speech Emotion Recognition: Considering Subjectivity of Annotators and Ambiguity of Emotions' posted on [arXiv].
#Review#Speech Emotion Recognition#Annotator Subjectivity#Emotion Ambiguity#Soft Labels#Multi-label Classification#Evaluation Metrics#Loss Functions

[Paper Review] MixReasoning: Switching Modes to Think

A detailed review of the paper 'MixReasoning: Switching Modes to Think' posted on [arXiv].
#Review#LLM Reasoning#Chain-of-Thought#Efficiency#LoRA#Adaptive Reasoning#Token Uncertainty#Dynamic Switching#Reasoning Compression

[Paper Review] Less is More: Recursive Reasoning with Tiny Networks

A detailed review of the paper 'Less is More: Recursive Reasoning with Tiny Networks' posted on [arXiv].
#Review#Recursive Reasoning#Tiny Networks#Deep Supervision#Hierarchical Reasoning Model (HRM)#Sudoku-Extreme#ARC-AGI#Generalization#Parameter Efficiency

[Paper Review] Human3R: Everyone Everywhere All at Once

A detailed review of the paper 'Human3R: Everyone Everywhere All at Once' posted on [arXiv] by Yuliang Xiu.
#Review#4D Human-Scene Reconstruction#Online Reconstruction#Multi-person#SMPL-X#Transformer#Visual Prompt Tuning#Real-time#Foundation Model

[Paper Review] Fast-dLLM v2: Efficient Block-Diffusion LLM

A detailed review of the paper 'Fast-dLLM v2: Efficient Block-Diffusion LLM' posted on [arXiv].
#Review#Diffusion LLMs#Inference Acceleration#Parallel Decoding#Autoregressive Models#Caching#Fine-tuning#Block-wise Attention

[Paper Review] Drax: Speech Recognition with Discrete Flow Matching

A detailed review of the paper 'Drax: Speech Recognition with Discrete Flow Matching' posted on [arXiv].
#Review#Automatic Speech Recognition (ASR)#Discrete Flow Matching (DFM)#Non-Autoregressive (NAR)#Generative Models#Tri-mixture Probability Path#Parallel Decoding#Accuracy-Efficiency Trade-off#Speech Synthesis

[Paper Review] CoDA: Coding LM via Diffusion Adaptation

A detailed review of the paper 'CoDA: Coding LM via Diffusion Adaptation' posted on [arXiv].
#Review#Diffusion Language Models#Code Generation#Bidirectional Decoding#Text Infilling#Instruction Tuning#Lightweight Models#TPU Training

[Paper Review] ASPO: Asymmetric Importance Sampling Policy Optimization

A detailed review of the paper 'ASPO: Asymmetric Importance Sampling Policy Optimization' posted on [arXiv] by Xiu Li.
#Review#Reinforcement Learning#Large Language Models#Importance Sampling#Policy Optimization#PPO-Clip#Outcome-Supervised RL#Token Weighting#GRPO

[Paper Review] Watch and Learn: Learning to Use Computers from Online Videos

A detailed review of the paper 'Watch and Learn: Learning to Use Computers from Online Videos' posted on [arXiv] by Oriana Riva.
#Review#Computer Use Agents#Inverse Dynamics Model#UI Trajectories#Web Videos#In-Context Learning#Supervised Fine-Tuning#Large Language Models#OSWorld Benchmark

[Paper Review] Utility-Learning Tension in Self-Modifying Agents

A detailed review of the paper 'Utility-Learning Tension in Self-Modifying Agents' posted on [arXiv] by Peter Jin.
#Review#Self-Modifying Agents#PAC Learnability#VC Dimension#Capacity Bounds#Metacognition#Architectural Search#Algorithmic Stability#Generalization Theory

[Paper Review] Thai Semantic End-of-Turn Detection for Real-Time Voice Agents

A detailed review of the paper 'Thai Semantic End-of-Turn Detection for Real-Time Voice Agents' posted on [arXiv] by Monthol Charattrakool.
#Review#End-of-Turn Detection#Thai NLP#Voice Agents#Real-time Inference#Transformer Models#Few-shot Learning#Fine-tuning#Latency Optimization

[Paper Review] Self-Reflective Generation at Test Time

A detailed review of the paper 'Self-Reflective Generation at Test Time' posted on [arXiv] by Shuang Qiu.
#Review#Large Language Models#Self-Reflection#Test-Time Optimization#Uncertainty Monitoring#Proactive Error Prevention#Reasoning Tasks#Chain-of-Thought

[Paper Review] Optimal Scaling Needs Optimal Norm

A detailed review of the paper 'Optimal Scaling Needs Optimal Norm' posted on [arXiv] by Stefan Kesselheim.
#Review#Optimal Scaling#Norm-Based Optimizers#Hyperparameter Transfer#Learning Rate Scaling#Batch Size Scaling#Transformer Models#Scion Optimizer#Large Language Models

[Paper Review] Imperceptible Jailbreaking against Large Language Models

A detailed review of the paper 'Imperceptible Jailbreaking against Large Language Models' posted on [arXiv].
#Review#Large Language Models#Jailbreaking#Imperceptible Attacks#Unicode Variation Selectors#Adversarial Suffixes#Safety Alignment#Prompt Injection

[Paper Review] Character Mixing for Video Generation

A detailed review of the paper 'Character Mixing for Video Generation' posted on [arXiv].
#Review#Video Generation#Character Mixing#Style Preservation#Multi-character Interaction#Text-to-Video#Cross-Domain Synthesis#Identity Preservation

[Paper Review] Self-Improvement in Multimodal Large Language Models: A Survey

A detailed review of the paper 'Self-Improvement in Multimodal Large Language Models: A Survey' posted on [arXiv] by Yapeng Tian.
#Review#Multimodal Large Language Models (MLLMs)#Self-Improvement#Data Collection#Data Organization#Model Optimization#Survey#Reinforcement Learning#Direct Preference Optimization

[Paper Review] OrtSAE: Orthogonal Sparse Autoencoders Uncover Atomic Features

A detailed review of the paper 'OrtSAE: Orthogonal Sparse Autoencoders Uncover Atomic Features' posted on [arXiv] by Elena Tutubalina.
#Review#Sparse Autoencoders#Mechanistic Interpretability#Feature Disentanglement#Orthogonality#LLM Features#Feature Absorption#Feature Composition

[Paper Review] Apriel-1.5-15b-Thinker

A detailed review of the paper 'Apriel-1.5-15b-Thinker' posted on [arXiv].
#Review#Multimodal Reasoning Model#Open-Weights Model#Continual Pretraining (CPT)#Supervised Fine-Tuning (SFT)#Training Design#Efficiency#Frontier Performance

[Paper Review] A Practitioner's Guide to Multi-turn Agentic Reinforcement Learning

A detailed review of the paper 'A Practitioner's Guide to Multi-turn Agentic Reinforcement Learning' posted on [arXiv].
#Review#Multi-turn Reinforcement Learning#LLM Agents#Text-based Environments#Reward Shaping#Policy Optimization#Supervised Fine-tuning (SFT)#Generalization#Environment Complexity

[Paper Review] Training Vision-Language Process Reward Models for Test-Time Scaling in Multimodal Reasoning: Key Insights and Lessons Learned

A detailed review of the paper 'Training Vision-Language Process Reward Models for Test-Time Scaling in Multimodal Reasoning: Key Insights and Lessons Learned' posted on [arXiv].
#Review#Vision-Language Models (VLMs)#Process Reward Models (PRMs)#Multimodal Reasoning#Test-Time Scaling (TTS)#Process Supervision#Dataset Construction#Perception Errors#MCTS

[Paper Review] Making, not Taking, the Best of N

A detailed review of the paper 'Making, not Taking, the Best of N' posted on [arXiv].
#Review#LLM Aggregation#Generative Fusion#Best-of-N#Synthetic Data Generation#Test-Time Scaling#Multilingual Models#Ensemble Learning

[Paper Review] Knapsack RL: Unlocking Exploration of LLMs via Optimizing Budget Allocation

A detailed review of the paper 'Knapsack RL: Unlocking Exploration of LLMs via Optimizing Budget Allocation' posted on [arXiv].
#Review#Large Language Models (LLMs)#Reinforcement Learning (RL)#Exploration Budget Allocation#Knapsack Problem#Group Relative Policy Optimization (GRPO)#Mathematical Reasoning#Resource Optimization

[Paper Review] JoyAgent-JDGenie: Technical Report on the GAIA

A detailed review of the paper 'JoyAgent-JDGenie: Technical Report on the GAIA' posted on [arXiv].
#Review#Generalist Agent#Multi-Agent System#Plan-Execute#ReAct#Hierarchical Memory#Tool Integration#GAIA Benchmark#LLM Agent

[Paper Review] Infusing Theory of Mind into Socially Intelligent LLM Agents

A detailed review of the paper 'Infusing Theory of Mind into Socially Intelligent LLM Agents' posted on [arXiv].
#Review#Theory of Mind#Large Language Models#Social Agents#Dialogue Systems#Mental State Modeling#Look-ahead Planning#Supervised Fine-tuning#Sotopia Benchmark

[Paper Review] GEM: A Gym for Agentic LLMs

A detailed review of the paper 'GEM: A Gym for Agentic LLMs' posted on [arXiv].
#Review#Agentic LLMs#Reinforcement Learning#Environment Simulator#Multi-turn Interactions#Return Batch Normalization#Tool Integration#Benchmarking

[Paper Review] Eliciting Secret Knowledge from Language Models

A detailed review of the paper 'Eliciting Secret Knowledge from Language Models' posted on [arXiv] by Neel Nanda.
#Review#Language Models#Secret Elicitation#Mechanistic Interpretability#Black-box Methods#White-box Methods#AI Auditing#Model Organisms#Prefill Attacks

[Paper Review] DeepSearch: Overcome the Bottleneck of Reinforcement Learning with Verifiable Rewards via Monte Carlo Tree Search

A detailed review of the paper 'DeepSearch: Overcome the Bottleneck of Reinforcement Learning with Verifiable Rewards via Monte Carlo Tree Search' posted on [arXiv].
#Review#Reinforcement Learning with Verifiable Rewards (RLVR)#Monte Carlo Tree Search (MCTS)#Mathematical Reasoning#Large Language Models (LLMs)#Systematic Exploration#Adaptive Training#Tree-GRPO

[Paper Review] Boolean Satisfiability via Imitation Learning

A detailed review of the paper 'Boolean Satisfiability via Imitation Learning' posted on [arXiv] by Xiangyu Xu.
#Review#Boolean Satisfiability#Imitation Learning#CDCL Solvers#Branching Policy#KeyTrace#Transformer Architecture#Perceiver AR

[Paper Review] Beyond Log Likelihood: Probability-Based Objectives for Supervised Fine-Tuning across the Model Capability Continuum

A detailed review of the paper 'Beyond Log Likelihood: Probability-Based Objectives for Supervised Fine-Tuning across the Model Capability Continuum' posted on [arXiv] by Hanghang Tong.
#Review#Supervised Fine-tuning (SFT)#Large Language Models (LLMs)#Training Objectives#Negative Log Likelihood (NLL)#Model Capability Continuum#Generalization#Probability-based Loss Functions

[Paper Review] dParallel: Learnable Parallel Decoding for dLLMs

A detailed review of the paper 'dParallel: Learnable Parallel Decoding for dLLMs' posted on [arXiv].
#Review#Diffusion Language Models#Parallel Decoding#Inference Acceleration#Certainty Distillation#Self-Distillation#Masked Language Models#LLaDA

[Paper Review] Who invented deep residual learning?

A detailed review of the paper 'Who invented deep residual learning?' posted on [arXiv] by Juergen Schmidhuber.
#Review#Deep Learning History#Residual Connections#Recurrent Neural Networks (RNN)#Long Short-Term Memory (LSTM)#Feedforward Neural Networks (FNN)#Highway Networks#ResNet#Vanishing Gradient

[Paper Review] TTT3R: 3D Reconstruction as Test-Time Training

A detailed review of the paper 'TTT3R: 3D Reconstruction as Test-Time Training' posted on [arXiv] by Anpei Chen.
#Review#3D Reconstruction#Test-Time Training (TTT)#Recurrent Neural Networks (RNN)#Online Learning#Length Generalization#Associative Memory#State Update Rule

[Paper Review] Regression Language Models for Code

A detailed review of the paper 'Regression Language Models for Code' posted on [arXiv].
#Review#Regression Language Model#Code Performance Prediction#Static Analysis#Neural Architecture Search#Text-to-Text Regression#Multi-task Learning#T5Gemma#ONNX

[Paper Review] OceanGym: A Benchmark Environment for Underwater Embodied Agents

A detailed review of the paper 'OceanGym: A Benchmark Environment for Underwater Embodied Agents' posted on [arXiv].
#Review#Underwater Robotics#Embodied AI#Benchmark Environment#Multi-modal Large Language Models#Autonomous Underwater Vehicles#Perception#Decision-Making#Simulation

[Paper Review] MotionRAG: Motion Retrieval-Augmented Image-to-Video Generation

A detailed review of the paper 'MotionRAG: Motion Retrieval-Augmented Image-to-Video Generation' posted on [arXiv] by Limin Wang.
#Review#Image-to-Video Generation#Motion Transfer#Retrieval-Augmented Generation (RAG)#In-Context Learning#Diffusion Models#Video Diffusion#Motion Realism

[Paper Review] LayerD: Decomposing Raster Graphic Designs into Layers

A detailed review of the paper 'LayerD: Decomposing Raster Graphic Designs into Layers' posted on [arXiv] by Kota Yamaguchi.
#Review#Graphic Design#Image Decomposition#Layer Extraction#Image Matting#Background Completion#Deep Learning#Creative AI#Dynamic Time Warping

[Paper Review] Knowledge Homophily in Large Language Models

A detailed review of the paper 'Knowledge Homophily in Large Language Models' posted on [arXiv] by Nedim Lipka.
#Review#LLM#Knowledge Homophily#Graph Neural Networks#Knowledge Graph#Knowledge Injection#Question Answering#Fine-tuning#Knowledge Retrieval

[Paper Review] Humanline: Online Alignment as Perceptual Loss

A detailed review of the paper 'Humanline: Online Alignment as Perceptual Loss' posted on [arXiv].
#Review#LLM Alignment#Online RLHF#Offline RLHF#Prospect Theory#Perceptual Loss#Human-Centric AI#Reinforcement Learning

[Paper Review] DA^2: Depth Anything in Any Direction

A detailed review of the paper 'DA^2: Depth Anything in Any Direction' posted on [arXiv].
#Review#Panoramic Depth Estimation#Zero-shot Generalization#Data Curation#SphereViT#Spherical Geometry#360-degree Imaging#Vision Transformer

[Paper Review] A Cartography of Open Collaboration in Open Source AI: Mapping Practices, Motivations, and Governance in 14 Open Large Language Model Projects

A detailed review of the paper 'A Cartography of Open Collaboration in Open Source AI: Mapping Practices, Motivations, and Governance in 14 Open Large Language Model Projects' posted on [arXiv] by Jennifer Ding.
#Review#Open Source AI#LLM Development#Open Collaboration#Governance Models#Developer Motivations#Community Engagement#AI Ecosystem

[Paper Review] Visual Jigsaw Post-Training Improves MLLMs

A detailed review of the paper 'Visual Jigsaw Post-Training Improves MLLMs' posted on [arXiv] by Lewei Lu.
#Review#MLLMs#Post-training#Self-supervised Learning#Visual Understanding#Jigsaw Puzzles#RLVR#Multimodal Perception#Spatial Reasoning

[Paper Review] Multiplayer Nash Preference Optimization

A detailed review of the paper 'Multiplayer Nash Preference Optimization' posted on [arXiv].
#Review#RLHF#LLM Alignment#Nash Equilibrium#Multiplayer Games#Preference Optimization#Non-transitive Preferences#Game Theory

[Paper Review] Variational Reasoning for Language Models

A detailed review of the paper 'Variational Reasoning for Language Models' posted on [arXiv].
#Review#Variational Inference#Language Models#Reasoning#ELBO#IWAE#Reinforcement Learning#Latent Variables#Forward-KL

[Paper Review] RefAM: Attention Magnets for Zero-Shot Referral Segmentation

A detailed review of the paper 'RefAM: Attention Magnets for Zero-Shot Referral Segmentation' posted on [arXiv] by Federico Tombari.
#Review#Zero-Shot Segmentation#Referring Segmentation#Diffusion Transformers (DiTs)#Attention Mechanisms#Attention Sinks#Stop Words#Vision-Language Models#Training-Free Methods

[Paper Review] Real-Time Object Detection Meets DINOv3

A detailed review of the paper 'Real-Time Object Detection Meets DINOv3' posted on [arXiv] by Xi Shen.
#Review#Real-time Object Detection#DINOv3#DEIMv2#Vision Transformer#Multi-scale Features#Spatial Tuning Adapter#Lightweight Models#Object Detection Framework

[Paper Review] LongLive: Real-time Interactive Long Video Generation

A detailed review of the paper 'LongLive: Real-time Interactive Long Video Generation' posted on [arXiv].
#Review#Long Video Generation#Real-time#Interactive AI#Autoregressive Models#KV Cache#Streaming Tuning#Attention Sink#Diffusion Models

[Paper Review] Fine-tuning Done Right in Model Editing

A detailed review of the paper 'Fine-tuning Done Right in Model Editing' posted on [arXiv] by Du Su.
#Review#Model Editing#Fine-tuning#Large Language Models#Catastrophic Forgetting#Breadth-First Pipeline#Depth-First Pipeline#Localized Tuning#Lifelong Learning

[Paper Review] D-Artemis: A Deliberative Cognitive Framework for Mobile GUI Multi-Agents

A detailed review of the paper 'D-Artemis: A Deliberative Cognitive Framework for Mobile GUI Multi-Agents' posted on [arXiv] by Jinyuan Li.
#Review#Mobile GUI Automation#Multi-Agent System#Cognitive Architecture#Pre-execution Alignment#Post-execution Reflection#Retrieval-Augmented Generation#Multimodal LLM#Deliberative AI

[Paper Review] CHURRO: Making History Readable with an Open-Weight Large Vision-Language Model for High-Accuracy, Low-Cost Historical Text Recognition

A detailed review of the paper 'CHURRO: Making History Readable with an Open-Weight Large Vision-Language Model for High-Accuracy, Low-Cost Historical Text Recognition' posted on [arXiv].
#Review#Historical Text Recognition#Vision-Language Model#Open-Weight Model#OCR#Cultural Heritage#Low-Cost AI#Dataset Curation#Fine-tuning

[Paper Review] Tree Search for LLM Agent Reinforcement Learning

A detailed review of the paper 'Tree Search for LLM Agent Reinforcement Learning' posted on [arXiv] by Xiangxiang Chu.
#Review#LLM Agents#Reinforcement Learning#Tree Search#Policy Optimization#Preference Learning#Sparse Rewards#Multi-turn Tasks

[Paper Review] Thinking Augmented Pre-training

A detailed review of the paper 'Thinking Augmented Pre-training' posted on [arXiv] by Furu Wei.
#Review#Large Language Models (LLMs)#Pre-training#Data Augmentation#Reasoning#Data Efficiency#Thinking Trajectories

[Paper Review] SciReasoner: Laying the Scientific Reasoning Ground Across Disciplines

A detailed review of the paper 'SciReasoner: Laying the Scientific Reasoning Ground Across Disciplines' posted on [arXiv] by Jiabei Xiao.
#Review#Scientific Reasoning#Foundation Models#Multi-modal Learning#Cross-domain Generalization#Chain-of-Thought#Reinforcement Learning#Scientific Discovery#Molecular Design

[Paper Review] Residual Off-Policy RL for Finetuning Behavior Cloning Policies

A detailed review of the paper 'Residual Off-Policy RL for Finetuning Behavior Cloning Policies' posted on [arXiv] by Pieter Abbeel.
#Review#Reinforcement Learning (RL)#Behavior Cloning (BC)#Residual Learning#Off-Policy RL#Robot Manipulation#Real-World Robotics#High-DoF Systems#Sample Efficiency

[Paper Review] Quantized Visual Geometry Grounded Transformer

A detailed review of the paper 'Quantized Visual Geometry Grounded Transformer' posted on [arXiv] by Yuqi Li.
#Review#Quantization#Post-Training Quantization#3D Reconstruction#Visual Transformer#Model Compression#Efficient Inference#Hadamard Rotation#Calibration Sampling

[Paper Review] Interactive Recommendation Agent with Active User Commands

A detailed review of the paper 'Interactive Recommendation Agent with Active User Commands' posted on [arXiv] by Xueyang Feng.
#Review#Interactive Recommendation#Large Language Models#Multi-Agent System#Natural Language Processing#Knowledge Distillation#User Control

[Paper Review] AutoIntent: AutoML for Text Classification

A detailed review of the paper 'AutoIntent: AutoML for Text Classification' posted on [arXiv] by Denis Kuznetsov.
#Review#AutoML#Text Classification#Intent Classification#Transformer Embeddings#Out-of-Scope Detection#Multi-label Classification#Few-shot Learning#Sklearn-like Interface

[Paper Review] Video models are zero-shot learners and reasoners

A detailed review of the paper 'Video models are zero-shot learners and reasoners' posted on [arXiv] by rgeirhos.
#Review#Video Models#Zero-shot Learning#Visual Reasoning#Foundation Models#Generative AI#Perception#Manipulation#Modeling

[Paper Review] SIM-CoT: Supervised Implicit Chain-of-Thought

A detailed review of the paper 'SIM-CoT: Supervised Implicit Chain-of-Thought' posted on [arXiv] by Yuhang Cao.
#Review#Implicit Reasoning#Chain-of-Thought#LLM#Latent Space#Supervised Learning#Model Stability#Interpretability

[Paper Review] Logics-Parsing Technical Report

A detailed review of the paper 'Logics-Parsing Technical Report' posted on [arXiv] by Fan Yang.
#Review#Document Parsing#Large Vision-Language Models (LVLM)#Reinforcement Learning (RL)#Layout Analysis#Reading Order#Supervised Fine-Tuning (SFT)#HTML Annotation#Benchmarking

[Paper Review] Zero-Shot Multi-Spectral Learning: Reimagining a Generalist Multimodal Gemini 2.5 Model for Remote Sensing Applications

A detailed review of the paper 'Zero-Shot Multi-Spectral Learning: Reimagining a Generalist Multimodal Gemini 2.5 Model for Remote Sensing Applications' posted on [arXiv] by Genady Beryozkin.
#Review#Remote Sensing#Zero-Shot Learning#Multimodal Models#Multi-spectral Imagery#Gemini 2.5#Prompt Engineering#Land Cover Classification#Pseudo-Image

[Paper Review] Reinforcement Learning on Pre-Training Data

A detailed review of the paper 'Reinforcement Learning on Pre-Training Data' posted on [arXiv] by Evander Yang.
#Review#Reinforcement Learning#Pre-training#Large Language Models#Self-supervised Learning#Scaling Laws#Next-segment Reasoning#Reward Modeling

[Paper Review] OpenGVL - Benchmarking Visual Temporal Progress for Data Curation

A detailed review of the paper 'OpenGVL - Benchmarking Visual Temporal Progress for Data Curation' posted on [arXiv] by Viktor Petrenko.
#Review#Robotics Data Curation#Visual Temporal Progress#Generative Value Learning (GVL)#Vision-Language Models (VLMs)#Benchmark#Task Progress Prediction#Value-Order Correlation (VOC)

[Paper Review] MAPO: Mixed Advantage Policy Optimization

A detailed review of the paper 'MAPO: Mixed Advantage Policy Optimization' posted on [arXiv] by Xuankun Rong.
#Review#Reinforcement Learning#Foundation Models#Policy Optimization#Advantage Function#Trajectory Certainty#Multimodal Reasoning#GRPO

[Paper Review] Do You Need Proprioceptive States in Visuomotor Policies?

A detailed review of the paper 'Do You Need Proprioceptive States in Visuomotor Policies?' posted on [arXiv] by Yushen Liang.
#Review#Visuomotor Policies#Spatial Generalization#Imitation Learning#Proprioception#State-free Policies#Robot Manipulation#End-Effector Control#Data Efficiency

[Paper Review] VaseVQA: Multimodal Agent and Benchmark for Ancient Greek Pottery

A detailed review of the paper 'VaseVQA: Multimodal Agent and Benchmark for Ancient Greek Pottery' posted on [arXiv] by Shiya Huang.
#Review#Multimodal Large Language Models#Visual Question Answering#Reinforcement Learning#Cultural Heritage#Ancient Greek Pottery#Supervised Fine-Tuning#Benchmark

[Paper Review] Understanding Embedding Scaling in Collaborative Filtering

A detailed review of the paper 'Understanding Embedding Scaling in Collaborative Filtering' posted on [arXiv] by Yonghui Yang.
#Review#Collaborative Filtering#Embedding Scaling#Noise Robustness#Recommender Systems#Graph Neural Networks#Self-supervised Learning#Performance Degradation

[Paper Review] Synthetic bootstrapped pretraining

A detailed review of the paper 'Synthetic bootstrapped pretraining' posted on [arXiv] by Emmanuel Candès.
#Review#Language Model Pretraining#Synthetic Data#Inter-document Correlation#Data Augmentation#Transformer#Bootstrapping#Concept Learning

[Paper Review] Qwen3-Omni Technical Report

A detailed review of the paper 'Qwen3-Omni Technical Report' posted on [arXiv] by Lhma-aslp.
#Review#Multimodal Model#Thinker-Talker Architecture#Mixture-of-Experts#Low-latency#Audio Understanding#Cross-modal Reasoning#State-of-the-Art#Real-time Interaction

[Paper Review] Mano Report

A detailed review of the paper 'Mano Report' posted on [arXiv] by Minghui Wu.
#Review#GUI Agent#Multi-modal Foundation Model#Reinforcement Learning#Supervised Fine-tuning#Simulated Environment#Data Generation#Error Recovery#Web Automation

[Paper Review] LIMI: Less is More for Agency

A detailed review of the paper 'LIMI: Less is More for Agency' posted on [arXiv] by happyZYM.
#Review#AI Agency#Data Curation#Less Is More#Agentic Intelligence#Foundation Models#Evaluation Benchmark#Efficiency Principle#Large Language Models

[Paper Review] FlagEval Findings Report: A Preliminary Evaluation of Large Reasoning Models on Automatically Verifiable Textual and Visual Questions

A detailed review of the paper 'FlagEval Findings Report: A Preliminary Evaluation of Large Reasoning Models on Automatically Verifiable Textual and Visual Questions' posted on [arXiv] by tengdai722.
#Review#Large Reasoning Models#LLM Evaluation#Multimodal AI#Reasoning Behaviors#Hallucination#Contamination-Free#AI Safety#Instruction Following

[Paper Review] DIWALI - Diversity and Inclusivity aWare cuLture specific Items for India: Dataset and Assessment of LLMs for Cultural Text Adaptation in Indian Context

A detailed review of the paper 'DIWALI - Diversity and Inclusivity aWare cuLture specific Items for India: Dataset and Assessment of LLMs for Cultural Text Adaptation in Indian Context' posted on [arXiv] by Maunendra Sankar Desarkar.
#Review#Cultural Adaptation#Large Language Models#Indian Culture#Dataset Creation#CSI#Human Evaluation#LLM Evaluation#Cultural Bias

[Paper Review] Cross-Attention is Half Explanation in Speech-to-Text Models

A detailed review of the paper 'Cross-Attention is Half Explanation in Speech-to-Text Models' posted on [arXiv] by Luisa Bentivogli.
#Review#Cross-attention#Speech-to-Text (S2T)#Explainable AI (XAI)#Saliency Maps#Feature Attribution#Transformer#Context Mixing#Correlation

[Paper Review] ARE: Scaling Up Agent Environments and Evaluations

A detailed review of the paper 'ARE: Scaling Up Agent Environments and Evaluations' posted on [arXiv] by Matteo Bettini.
#Review#Agent Environments#Agent Evaluation#LLM Agents#Asynchronous Systems#Reinforcement Learning#Tool Use#Multi-agent Collaboration#Benchmark

[Paper Review] SPATIALGEN: Layout-guided 3D Indoor Scene Generation

A detailed review of the paper 'SPATIALGEN: Layout-guided 3D Indoor Scene Generation' posted on [arXiv] by Yongsen Mao.
#Review#3D Scene Generation#Layout Guidance#Diffusion Models#Multi-view Synthesis#Synthetic Dataset#Indoor Environments#Gaussian Splatting#Semantic Consistency

[Paper Review] RGB-Only Supervised Camera Parameter Optimization in Dynamic Scenes

A detailed review of the paper 'RGB-Only Supervised Camera Parameter Optimization in Dynamic Scenes' posted on [arXiv] by Narendra Ahuja.
#Review#Camera Parameter Optimization#Dynamic Scenes#RGB-Only Supervision#Structure from Motion#Outlier Robustness#3D Gaussian Splatting#Two-stage Optimization#Point Tracking

[Paper Review] Lynx: Towards High-Fidelity Personalized Video Generation

A detailed review of the paper 'Lynx: Towards High-Fidelity Personalized Video Generation' posted on [arXiv] by Linjie Luo.
#Review#Personalized Video Generation#Diffusion Transformer#Identity Preservation#Video Synthesis#Adapter Networks#Facial Recognition#Cross-Attention

[Paper Review] Do You Hear What I Mean? Quantifying the Instruction-Perception Gap in Instruction-Guided Expressive Text-To-Speech Systems

A detailed review of the paper 'Do You Hear What I Mean? Quantifying the Instruction-Perception Gap in Instruction-Guided Expressive Text-To-Speech Systems' posted on [arXiv] by Hung-yi Lee.
#Review#Instruction-Guided TTS#Expressive Speech Synthesis#Human Perception#Subjective Evaluation#Controllability#Instruction Following#Evaluation Metrics

[Paper Review] BTL-UI: Blink-Think-Link Reasoning Model for GUI Agent

A detailed review of the paper 'BTL-UI: Blink-Think-Link Reasoning Model for GUI Agent' posted on [arXiv] by Jiahui Yang.
#Review#GUI Agent#Human-GUI Interaction#Cognitive Modeling#Reinforcement Learning#Multimodal Large Language Models#Attention Mechanisms#Action Planning

[Paper Review] Unleashing the Potential of Multimodal LLMs for Zero-Shot Spatio-Temporal Video Grounding

A detailed review of the paper 'Unleashing the Potential of Multimodal LLMs for Zero-Shot Spatio-Temporal Video Grounding' posted on [arXiv] by Rynson W. H. Lau.
#Review#Spatio-Temporal Video Grounding#Multimodal Large Language Models#Zero-Shot Learning#Visual Grounding#Decomposed Spatio-Temporal Highlighting#Logit-Guided Re-attention#Temporal-Augmented Assembling

[Paper Review] RynnVLA-001: Using Human Demonstrations to Improve Robot Manipulation

A detailed review of the paper 'RynnVLA-001: Using Human Demonstrations to Improve Robot Manipulation' posted on [arXiv] by SpaceProduct.
#Review#Vision-Language-Action (VLA) Model#Robot Manipulation#Human Demonstrations#Video Generative Pretraining#Ego-Centric Video#Trajectory Prediction#ActionVAE#Transformer

[Paper Review] RecoWorld: Building Simulated Environments for Agentic Recommender Systems

A detailed review of the paper 'RecoWorld: Building Simulated Environments for Agentic Recommender Systems' posted on [arXiv] by Mingyuan Wu.
#Review#Agentic Recommender Systems#Simulated Environments#LLM-driven Simulation#Multi-turn Interaction#Reinforcement Learning#User Retention#Instruction Following#Multi-agent Systems

[Paper Review] FlowRL: Matching Reward Distributions for LLM Reasoning

A detailed review of the paper 'FlowRL: Matching Reward Distributions for LLM Reasoning' posted on [arXiv] by Hengli Li.
#Review#Reinforcement Learning#Large Language Models#Reward Distribution Matching#GFlowNets#Mode Collapse#Diverse Reasoning#Flow-Balanced Optimization

[논문리뷰] AToken: A Unified Tokenizer for Vision

Mingze Xu이 [arXiv]에 게시한 'AToken: A Unified Tokenizer for Vision' 논문에 대한 자세한 리뷰입니다.
#Review#Unified Visual Tokenizer#Multimodal AI#Transformer Architecture#4D Representation#Adversarial-free Training#Reconstruction#Semantic Understanding#Generative Models

[논문리뷰] SAIL-VL2 Technical Report

Zijian Kang이 [arXiv]에 게시한 'SAIL-VL2 Technical Report' 논문에 대한 자세한 리뷰입니다.
#Review#Vision-Language Model#Multimodal Understanding#Mixture-of-Experts#Progressive Training#Data Curation#Supervised Fine-tuning#Reinforcement Learning#SAIL-ViT

[논문리뷰] PANORAMA: The Rise of Omnidirectional Vision in the Embodied AI Era

Zihao Dongfang이 [arXiv]에 게시한 'PANORAMA: The Rise of Omnidirectional Vision in the Embodied AI Era' 논문에 대한 자세한 리뷰입니다.
#Review#Omnidirectional Vision#Embodied AI#Panoramic Perception#Multi-modal Learning#Dataset Development#Robot Navigation#Spatial Reasoning#System Architecture

[논문리뷰] MARS2 2025 Challenge on Multimodal Reasoning: Datasets, Methods, Results, Discussion, and Outlook

Bowen Zhou이 [arXiv]에 게시한 'MARS2 2025 Challenge on Multimodal Reasoning: Datasets, Methods, Results, Discussion, and Outlook' 논문에 대한 자세한 리뷰입니다.
#Review#Multimodal Reasoning#Large Language Models (LLMs)#Multimodal Large Language Models (MLLMs)#Visual Grounding#Visual Question Answering#Advertisement Video Analysis#Real-world Scenarios#Challenge Benchmark

[Paper Review] Improving Context Fidelity via Native Retrieval-Augmented Reasoning

A detailed review of the paper 'Improving Context Fidelity via Native Retrieval-Augmented Reasoning', posted on [arXiv] by Xiangru Tang.
#Review#Context Fidelity#Retrieval-Augmented Generation (RAG)#Large Language Models (LLMs)#Reinforcement Learning (RL)#Supervised Fine-Tuning (SFT)#Hallucination#Question Answering#In-context Retrieval#Curriculum Learning

[Paper Review] GenExam: A Multidisciplinary Text-to-Image Exam

A detailed review of the paper 'GenExam: A Multidisciplinary Text-to-Image Exam', posted on [arXiv] by Yu Qiao.
#Review#Text-to-Image Generation#Multidisciplinary#Benchmark#Evaluation#AGI#Reasoning#Scoring System#Visual Question Answering

[Paper Review] Single-stream Policy Optimization

A detailed review of the paper 'Single-stream Policy Optimization', posted on [arXiv] by Zihan Ding.
#Review#Reinforcement Learning#LLM Optimization#Policy Gradient#Variance Reduction#Adaptive Sampling#Scalability#Agentic Systems#RLVR

[Paper Review] Scaling Agents via Continual Pre-training

A detailed review of the paper 'Scaling Agents via Continual Pre-training', posted on [arXiv] by Guangyu Li.
#Review#Agentic LLMs#Continual Pre-training#Deep Research Agents#Tool Use#Multi-step Reasoning#Data Synthesis#Scaling Laws

[Paper Review] Multimodal Reasoning for Science: Technical Report and 1st Place Solution to the ICML 2025 SeePhys Challenge

A detailed review of the paper 'Multimodal Reasoning for Science: Technical Report and 1st Place Solution to the ICML 2025 SeePhys Challenge', posted on [arXiv] by Wentao Zhang.
#Review#Multimodal Reasoning#Science AI#Caption-assisted Reasoning#SeePhys Challenge#Large Language Models#Visual Question Answering#Physics Problems#Cross-modal Alignment

[Paper Review] Exact Coset Sampling for Quantum Lattice Algorithms

A detailed review of the paper 'Exact Coset Sampling for Quantum Lattice Algorithms', posted on [arXiv] by Yifan Zhang.
#Review#Quantum Algorithms#Lattice Problems#Coset Sampling#Quantum Fourier Transform (QFT)#Modular Arithmetic#Quantum Cryptography#Exact Sampling

[Paper Review] 3D Aware Region Prompted Vision Language Model

A detailed review of the paper '3D Aware Region Prompted Vision Language Model', posted on [arXiv] by Xiaolong Li.
#Review#3D Vision#Vision-Language Models#Spatial Reasoning#Region Prompting#Multi-view Learning#Depth Estimation#Unified Representation#Generative AI

[Paper Review] PersonaX: Multimodal Datasets with LLM-Inferred Behavior Traits

A detailed review of the paper 'PersonaX: Multimodal Datasets with LLM-Inferred Behavior Traits', posted on [arXiv] by Zhenhao Chen.
#Review#Multimodal Dataset#LLM Inference#Behavioral Traits#Causal Representation Learning#Big Five#Multimodal AI#Causal Discovery#Human-Computer Interaction

[Paper Review] Dr.V: A Hierarchical Perception-Temporal-Cognition Framework to Diagnose Video Hallucination by Fine-grained Spatial-Temporal Grounding

A detailed review of the paper 'Dr.V: A Hierarchical Perception-Temporal-Cognition Framework to Diagnose Video Hallucination by Fine-grained Spatial-Temporal Grounding', posted on [arXiv] by Li Zheng.
#Review#Video Hallucination#Large Video Models (LVMs)#Hierarchical Reasoning#Spatial-Temporal Grounding#Diagnostic Framework#Benchmark Dataset#Multimodal AI

[Paper Review] X-Part: high fidelity and structure coherent shape decomposition

A detailed review of the paper 'X-Part: high fidelity and structure coherent shape decomposition', posted on [arXiv] by Yunhan Yang.
#Review#3D Shape Decomposition#Diffusion Models#Part-level Generation#Controllable Generation#Bounding Box Prompts#Semantic Features#Interactive Editing#Generative AI

[Paper Review] Virtual Agent Economies

A detailed review of the paper 'Virtual Agent Economies', posted on [arXiv] by William A. Cunningham.
#Review#AI Agents#Virtual Economy#Multi-Agent Systems#Economic Mechanisms#Governance#Blockchain#Resource Allocation#Agent Alignment

[Paper Review] Visual Programmability: A Guide for Code-as-Thought in Chart Understanding

A detailed review of the paper 'Visual Programmability: A Guide for Code-as-Thought in Chart Understanding', posted on [arXiv] by Ethan Chern.
#Review#Visual Programmability#Code-as-Thought (CaT)#Chart Understanding#Vision-Language Models (VLMs)#Reinforcement Learning (RL)#Adaptive Reasoning#Dual-Reward System#Multimodal AI

[Paper Review] The Choice of Divergence: A Neglected Key to Mitigating Diversity Collapse in Reinforcement Learning with Verifiable Reward

A detailed review of the paper 'The Choice of Divergence: A Neglected Key to Mitigating Diversity Collapse in Reinforcement Learning with Verifiable Reward', posted on [arXiv] by Xiaoyu Tan.
#Review#Reinforcement Learning#Large Language Models (LLMs)#Diversity Collapse#f-divergence#Forward-KL#JS-divergence#Pass@k#Catastrophic Forgetting

[Paper Review] SimpleVLA-RL: Scaling VLA Training via Reinforcement Learning

A detailed review of the paper 'SimpleVLA-RL: Scaling VLA Training via Reinforcement Learning', posted on [arXiv] by Zhaohui Yang.
#Review#Reinforcement Learning (RL)#Vision-Language-Action (VLA) Models#Robotic Manipulation#Data Scarcity#Generalization#Sim-to-Real Transfer#Online RL#Long-Horizon Planning

[Paper Review] RewardDance: Reward Scaling in Visual Generation

A detailed review of the paper 'RewardDance: Reward Scaling in Visual Generation', posted on [arXiv] by Liang Li.
#Review#Reward Model#Visual Generation#RLHF#VLM#Reward Scaling#Reward Hacking#Generative Paradigm#Context Scaling#Text-to-Image#Text-to-Video

[Paper Review] P3-SAM: Native 3D Part Segmentation

A detailed review of the paper 'P3-SAM: Native 3D Part Segmentation', posted on [arXiv] by Yunhan Yang.
#Review#3D Part Segmentation#Point Cloud Segmentation#Prompt-based Segmentation#Deep Learning#Transformer#Interactive Segmentation#Automatic Segmentation#Native 3D

[Paper Review] Hunyuan-MT Technical Report

A detailed review of the paper 'Hunyuan-MT Technical Report', posted on [arXiv] by Yang Du.
#Review#Machine Translation#Large Language Model#Multilingual#Low-Resource Languages#Reinforcement Learning#Weak-to-Strong Learning#Slow Thinking

[Paper Review] EnvX: Agentize Everything with Agentic AI

A detailed review of the paper 'EnvX: Agentize Everything with Agentic AI', posted on [arXiv] by Wenzheng Tom Tang.
#Review#Agentic AI#Multi-Agent Systems#Code Repository#Agentization#Natural Language Interaction#Agent-to-Agent Protocol#LLM-based Agents

[Paper Review] 3D and 4D World Modeling: A Survey

A detailed review of the paper '3D and 4D World Modeling: A Survey', posted on [arXiv] by Ao Liang.
#Review#3D World Modeling#4D World Modeling#Generative Models#Predictive Models#LiDAR#Occupancy Grids#Video Generation#Autonomous Driving#Robotics

[Paper Review] ΔL Normalization: Rethink Loss Aggregation in RLVR

A detailed review of the paper 'ΔL Normalization: Rethink Loss Aggregation in RLVR', posted on [arXiv] by Lili Qiu.
#Review#Reinforcement Learning#LLMs#Gradient Variance#Loss Aggregation#Unbiased Estimator#RLVR#Policy Gradient#Normalization

[Paper Review] Visual Representation Alignment for Multimodal Large Language Models

A detailed review of the paper 'Visual Representation Alignment for Multimodal Large Language Models', posted on [arXiv] by Heeseong Shin.
#Review#Multimodal LLMs#Visual Representation Alignment#Foundation Models#Regularization#Fine-grained Visual Understanding#Spatial Reasoning#Object Counting#Vision-Language Models

[Paper Review] Reconstruction Alignment Improves Unified Multimodal Models

A detailed review of the paper 'Reconstruction Alignment Improves Unified Multimodal Models', posted on [arXiv] by XuDong Wang.
#Review#Unified Multimodal Models#Image Generation#Image Editing#Post-training#Self-supervised Learning#Reconstruction Alignment#Visual Embeddings

[Paper Review] Language Self-Play For Data-Free Training

A detailed review of the paper 'Language Self-Play For Data-Free Training', posted on [arXiv] by Vijai Mohan.
#Review#Large Language Models#Reinforcement Learning#Self-Play#Data-Free Training#Instruction Following#Adversarial Training#Reward Modeling

[Paper Review] Curia: A Multi-Modal Foundation Model for Radiology

A detailed review of the paper 'Curia: A Multi-Modal Foundation Model for Radiology', posted on [arXiv] by Elodie Ferreres.
#Review#Foundation Model#Radiology#Computed Tomography (CT)#Magnetic Resonance Imaging (MRI)#Self-supervised Learning#Vision Transformer#Cross-Modality Generalization

[Paper Review] Causal Attention with Lookahead Keys

A detailed review of the paper 'Causal Attention with Lookahead Keys', posted on [arXiv] by Quanquan Gu.
#Review#Causal Attention#Lookahead Keys#Autoregressive Modeling#Language Models#Transformer#Perplexity Reduction#Parallel Training#Efficient Inference

[Paper Review] WebExplorer: Explore and Evolve for Training Long-Horizon Web Agents

A detailed review of the paper 'WebExplorer: Explore and Evolve for Training Long-Horizon Web Agents', posted on [arXiv] by Aili Chen.
#Review#Web Agents#Long-Horizon Reasoning#Large Language Models (LLMs)#Data Generation#Reinforcement Learning (RL)#Supervised Fine-tuning (SFT)#Web Navigation#Information Retrieval

[Paper Review] UniVerse-1: Unified Audio-Video Generation via Stitching of Experts

A detailed review of the paper 'UniVerse-1: Unified Audio-Video Generation via Stitching of Experts', posted on [arXiv] by Xinyao Liao.
#Review#Unified Audio-Video Generation#Stitching of Experts (SoE)#Multimodal Diffusion#Online Annotation#Cross-modal Noise Correlation#Foundation Models#Verse-Bench

[Paper Review] Reverse-Engineered Reasoning for Open-Ended Generation

A detailed review of the paper 'Reverse-Engineered Reasoning for Open-Ended Generation', posted on [arXiv] by Wangchunshu Zhou.
#Review#Deep Reasoning#Open-Ended Generation#Reverse-Engineered Reasoning (REER)#LLMs#Synthetic Data#Iterative Refinement#Perplexity Minimization#DeepWriting-20K

[Paper Review] Reinforced Visual Perception with Tools

A detailed review of the paper 'Reinforced Visual Perception with Tools', posted on [arXiv] by Mingyang Fu.
#Review#Visual Reasoning#Multimodal LLMs#Reinforcement Learning#Tool Usage#Perception-heavy Benchmarks#GRPO#Vision Tools

[Paper Review] Interleaving Reasoning for Better Text-to-Image Generation

A detailed review of the paper 'Interleaving Reasoning for Better Text-to-Image Generation', posted on [arXiv] by Shixiang Tang.
#Review#Text-to-Image Generation#Interleaving Reasoning#Multimodal Learning#Visual Quality#Fine-grained Detail#Diffusion Models#Self-Correction

[Paper Review] Does DINOv3 Set a New Medical Vision Standard?

A detailed review of the paper 'Does DINOv3 Set a New Medical Vision Standard?', posted on [arXiv] by Bailiang Jian.
#Review#Medical Imaging#Foundation Models#DINOv3#Self-Supervised Learning#Vision Transformer#2D/3D Classification#Segmentation#Domain Adaptation#Scaling Laws

[Paper Review] D-HUMOR: Dark Humor Understanding via Multimodal Open-ended Reasoning

A detailed review of the paper 'D-HUMOR: Dark Humor Understanding via Multimodal Open-ended Reasoning', posted on [arXiv] by Dhanvin Sanjay Namboodiri.
#Review#Dark Humor Detection#Multimodal Reasoning#Vision-Language Models (VLMs)#Iterative Reasoning Refinement#Meme Analysis#Content Moderation#Cross-Modal Attention#Dataset Annotation

[Paper Review] Why Language Models Hallucinate

A detailed review of the paper 'Why Language Models Hallucinate', posted on [arXiv] by Edwin Zhang.
#Review#Language Models#Hallucination#Pretraining#Post-training#Evaluation Metrics#Binary Classification#Uncertainty Quantification#Calibration

[Paper Review] Symbolic Graphics Programming with Large Language Models

A detailed review of the paper 'Symbolic Graphics Programming with Large Language Models', posted on [arXiv] by Kaipeng Zhang.
#Review#Symbolic Graphics Programming#Large Language Models#Reinforcement Learning#SVG Generation#Text-to-Image Synthesis#Cross-Modal Alignment#Program Synthesis

[Paper Review] Set Block Decoding is a Language Model Inference Accelerator

A detailed review of the paper 'Set Block Decoding is a Language Model Inference Accelerator', posted on [arXiv] by Jeremy Reizenstein.
#Review#Language Model Inference#Acceleration#Set Block Decoding#Next Token Prediction#Masked Token Prediction#Parallel Decoding#KV-caching#Diffusion Models

[Paper Review] MedVista3D: Vision-Language Modeling for Reducing Diagnostic Errors in 3D CT Disease Detection, Understanding and Reporting

A detailed review of the paper 'MedVista3D: Vision-Language Modeling for Reducing Diagnostic Errors in 3D CT Disease Detection, Understanding and Reporting', posted on [arXiv] by Vanessa Wildman.
#Review#3D CT#Vision-Language Model#Medical Imaging#Diagnostic Error Reduction#Multi-scale Alignment#Semantic Enrichment#Radiology Reporting#Zero-shot Learning

[Paper Review] Bootstrapping Task Spaces for Self-Improvement

A detailed review of the paper 'Bootstrapping Task Spaces for Self-Improvement', posted on [arXiv] by Yoram Bachrach.
#Review#Reinforcement Learning (RL)#Large Language Models (LLMs)#Self-Improvement#Autocurriculum#Task-Space Exploration#Inference-Time Iteration#Policy Optimization

[Paper Review] Behavioral Fingerprinting of Large Language Models

A detailed review of the paper 'Behavioral Fingerprinting of Large Language Models', posted on [arXiv] by Xing Li.
#Review#Large Language Models#Behavioral Evaluation#Model Alignment#Sycophancy#World Model Brittleness#Metacognition#Personality Profiling

[Paper Review] Transition Models: Rethinking the Generative Learning Objective

A detailed review of the paper 'Transition Models: Rethinking the Generative Learning Objective', posted on [arXiv] by Yangguang Li.
#Review#Generative Models#Diffusion Models#Training Objective#Continuous-Time Dynamics#State Transition#Few-Step Generation#Scalable Training#Image Generation

[Paper Review] Towards a Unified View of Large Language Model Post-Training

A detailed review of the paper 'Towards a Unified View of Large Language Model Post-Training', posted on [arXiv] by Hongyi Liu.
#Review#Large Language Models (LLMs)#Post-Training#Reinforcement Learning (RL)#Supervised Fine-Tuning (SFT)#Policy Gradient#Unified Framework#Hybrid Algorithms#Bias-Variance Tradeoff

[Paper Review] From Editor to Dense Geometry Estimator

A detailed review of the paper 'From Editor to Dense Geometry Estimator', posted on [arXiv] by Lang Nie.
#Review#Dense Geometry Estimation#Diffusion Transformer#Image Editing#Zero-shot Learning#Depth Estimation#Normal Estimation#Flow Matching#Logarithmic Quantization

[Paper Review] Robix: A Unified Model for Robot Interaction, Reasoning and Planning

A detailed review of the paper 'Robix: A Unified Model for Robot Interaction, Reasoning and Planning', posted on [arXiv] by Zixuan Wang.
#Review#Robot Learning#Vision-Language Models (VLMs)#Embodied AI#Human-Robot Interaction (HRI)#Task Planning#Reinforcement Learning (RL)#Chain-of-Thought (CoT) Reasoning#Robotics

[Paper Review] Open Data Synthesis For Deep Research

A detailed review of the paper 'Open Data Synthesis For Deep Research', posted on [arXiv] by Zheng Liu.
#Review#Data Synthesis#Deep Research#Hierarchical Constraint Satisfaction Problems#Large Language Models#Agentic AI#Reinforcement Learning#Question Answering

[Paper Review] Universal Deep Research: Bring Your Own Model and Strategy

A detailed review of the paper 'Universal Deep Research: Bring Your Own Model and Strategy', posted on [arXiv] by Pavlo Molchanov.
#Review#Agentic Systems#Language Models (LLMs)#Research Automation#Customizable Strategies#Code Generation#Deep Research#User-Defined Agents#Sandboxed Execution

[Paper Review] Towards More Diverse and Challenging Pre-training for Point Cloud Learning: Self-Supervised Cross Reconstruction with Decoupled Views

A detailed review of the paper 'Towards More Diverse and Challenging Pre-training for Point Cloud Learning: Self-Supervised Cross Reconstruction with Decoupled Views', posted on [arXiv] by Junchi Yan.
#Review#Point Cloud Learning#Self-Supervised Learning#Cross Reconstruction#Decoupled Views#Generative Models#Positional Encoding#3D Vision

[Paper Review] The Landscape of Agentic Reinforcement Learning for LLMs: A Survey

A detailed review of the paper 'The Landscape of Agentic Reinforcement Learning for LLMs: A Survey', posted on [arXiv] by Hejia Geng.
#Review#Agentic Reinforcement Learning#Large Language Models#LLM Agents#Sequential Decision Making#Policy Optimization#Tool Use#Dynamic Environments#Autonomous AI

[Paper Review] LLaVA-Critic-R1: Your Critic Model is Secretly a Strong Policy Model

A detailed review of the paper 'LLaVA-Critic-R1: Your Critic Model is Secretly a Strong Policy Model', posted on [arXiv] by Jianwei Yang.
#Review#Vision-Language Models (VLMs)#Critic Models#Policy Models#Reinforcement Learning (RL)#Self-Criticism#Multimodal Reasoning#Preference Learning#Generative Models

[Paper Review] Kwai Keye-VL 1.5 Technical Report

A detailed review of the paper 'Kwai Keye-VL 1.5 Technical Report', posted on [arXiv] by SXxtyz.
#Review#Multimodal LLMs#Video Understanding#Slow-Fast Encoding#Long Context#Chain-of-Thought#Reinforcement Learning#Human Alignment#Native-Resolution Vision Encoder

[Paper Review] Fantastic Pretraining Optimizers and Where to Find Them

A detailed review of the paper 'Fantastic Pretraining Optimizers and Where to Find Them', posted on [arXiv] by Percy Liang.
#Review#Deep Learning Optimizers#Large Language Models#Hyperparameter Tuning#Pretraining Speedup#Scaling Laws#AdamW#Matrix-based Optimizers#Data-to-Model Ratio

[Paper Review] DCPO: Dynamic Clipping Policy Optimization

A detailed review of the paper 'DCPO: Dynamic Clipping Policy Optimization', posted on [arXiv] by Kai Lu.
#Review#Reinforcement Learning#LLM#Policy Optimization#Dynamic Clipping#Advantage Standardization#RLVR#Reasoning

[Paper Review] AMBEDKAR-A Multi-level Bias Elimination through a Decoding Approach with Knowledge Augmentation for Robust Constitutional Alignment of Language Models

A detailed review of the paper 'AMBEDKAR-A Multi-level Bias Elimination through a Decoding Approach with Knowledge Augmentation for Robust Constitutional Alignment of Language Models', posted on [arXiv] by Rahul Karthikeyan.
#Review#Bias Mitigation#Large Language Models#Speculative Decoding#Constitutional AI#Fairness#Inference-Time Control#Indian Sociocultural Context

[Paper Review] R-4B: Incentivizing General-Purpose Auto-Thinking Capability in MLLMs via Bi-Mode Annealing and Reinforce Learning

A detailed review of the paper 'R-4B: Incentivizing General-Purpose Auto-Thinking Capability in MLLMs via Bi-Mode Annealing and Reinforce Learning', posted on [arXiv] by Han Hu.
#Review#Multimodal Large Language Models (MLLMs)#Auto-Thinking#Reinforcement Learning (RL)#Bi-mode Annealing#Bi-mode Policy Optimization (BPO)#General-Purpose AI#Reasoning#Efficiency

[Paper Review] Morae: Proactively Pausing UI Agents for User Choices

A detailed review of the paper 'Morae: Proactively Pausing UI Agents for User Choices', posted on [arXiv] by Amy Pavel.
#Review#UI Agents#Accessibility#Human-Agent Interaction#Mixed-Initiative AI#Large Multimodal Models#Proactive AI#User Choice#Blind and Low-Vision Users

[Paper Review] Efficient Code Embeddings from Code Generation Models

A detailed review of the paper 'Efficient Code Embeddings from Code Generation Models', posted on [arXiv] by Han Xiao.
#Review#Code Embeddings#Code Generation Models#Autoregressive Backbones#Last-Token Pooling#Instruction Tuning#Contrastive Learning#Retrieval-Augmented Generation#MTEB Benchmark

[Paper Review] CLIPSym: Delving into Symmetry Detection with CLIP

A detailed review of the paper 'CLIPSym: Delving into Symmetry Detection with CLIP', posted on [arXiv] by Raymond A. Yeh.
#Review#Symmetry Detection#Vision-Language Models#CLIP#Equivariant Networks#Prompt Engineering#Geometric Deep Learning

[Paper Review] AHELM: A Holistic Evaluation of Audio-Language Models

A detailed review of the paper 'AHELM: A Holistic Evaluation of Audio-Language Models', posted on [arXiv] by Siwei Yang.
#Review#Audio-Language Models#Holistic Evaluation#Benchmarking#Multimodality#Fairness#Robustness#Reasoning#Bias Detection

[Paper Review] rStar2-Agent: Agentic Reasoning Technical Report

A detailed review of the paper 'rStar2-Agent: Agentic Reasoning Technical Report', posted on [arXiv] by Weijiang Xu.
#Review#Agentic Reinforcement Learning#Math Reasoning#Code Interpreter#Tool Use#GRPO-RoC#LLM Training Efficiency#Self-Reflection

[Paper Review] ROSE: Remove Objects with Side Effects in Videos

A detailed review of the paper 'ROSE: Remove Objects with Side Effects in Videos', posted on [arXiv] by Hantang Liu.
#Review#Video Object Removal#Side Effects#3D Rendering#Diffusion Transformer#Video Inpainting#Synthetic Data#Difference Mask

[Paper Review] Provable Benefits of In-Tool Learning for Large Language Models

A detailed review of the paper 'Provable Benefits of In-Tool Learning for Large Language Models', posted on [arXiv] by Vivien Cabannes.
#Review#Large Language Models#In-Tool Learning#In-Weight Learning#Factual Recall#Retrieval-Augmented Generation#Scaling Laws#Parameter Efficiency#Catastrophic Forgetting

[Paper Review] Multi-View 3D Point Tracking

A detailed review of the paper 'Multi-View 3D Point Tracking', posted on [arXiv] by Irem Demir.
#Review#3D Point Tracking#Multi-View#Transformer#kNN Correlation#Depth Estimation#Dynamic Scenes#Occlusion Handling#Feature Fusion

[Paper Review] Mixture of Contexts for Long Video Generation

A detailed review of the paper 'Mixture of Contexts for Long Video Generation', posted on [arXiv] by Junfei Xiao.
#Review#Long Video Generation#Diffusion Transformers (DiT)#Sparse Attention#Context Routing#Memory Management#Generative Models#Video Synthesis

[Paper Review] FakeParts: a New Family of AI-Generated DeepFakes

A detailed review of the paper 'FakeParts: a New Family of AI-Generated DeepFakes', posted on [arXiv] by Xi Wang.
#Review#Deepfake Detection#Partial Deepfakes#AI-Generated Video#Benchmark Dataset#Video Forensics#Generative Models#Manipulation Detection#Human Perception

[Paper Review] AWorld: Orchestrating the Training Recipe for Agentic AI

A detailed review of the paper 'AWorld: Orchestrating the Training Recipe for Agentic AI', posted on [arXiv] by Qintong Wu.
#Review#Agentic AI#Reinforcement Learning#Distributed Systems#Experience Generation#LLM Fine-tuning#GAIA Benchmark#Scalability#AWORLD Framework

[Paper Review] StepWiser: Stepwise Generative Judges for Wiser Reasoning

A detailed review of the paper 'StepWiser: Stepwise Generative Judges for Wiser Reasoning', posted on [arXiv] by Olga Golovneva.
#Review#LLM Reasoning#Process Reward Models#Reinforcement Learning#Generative Judges#Stepwise Feedback#Chain-of-Thought#Meta-Reasoning

[Paper Review] Self-Rewarding Vision-Language Model via Reasoning Decomposition

A detailed review of the paper 'Self-Rewarding Vision-Language Model via Reasoning Decomposition', posted on [arXiv] by Zhenwen Liang.
#Review#Vision-Language Models#Reinforcement Learning#Self-Rewarding#Reasoning Decomposition#Visual Perception#Language Reasoning#Hallucinations#Language Shortcuts

[Paper Review] Predicting the Order of Upcoming Tokens Improves Language Modeling

A detailed review of the paper 'Predicting the Order of Upcoming Tokens Improves Language Modeling', posted on [arXiv] by Alham Fikri Aji.
#Review#Language Modeling#Next-Token Prediction#Multi-Token Prediction#Token Order Prediction#Auxiliary Objective#Learning-to-Rank#Transformer#Large Language Models

[Paper Review] Diffusion Language Models Know the Answer Before Decoding

A detailed review of the paper 'Diffusion Language Models Know the Answer Before Decoding', posted on [arXiv] by Shilin Yan.
#Review#Diffusion Language Models#DLM Acceleration#Early Answer Convergence#Early Commit Decoding#Confidence Gap#Inference Speedup#Training-Free

[Paper Review] CODA: Coordinating the Cerebrum and Cerebellum for a Dual-Brain Computer Use Agent with Decoupled Reinforcement Learning

A detailed review of the paper 'CODA: Coordinating the Cerebrum and Cerebellum for a Dual-Brain Computer Use Agent with Decoupled Reinforcement Learning', posted on [arXiv] by Jianze Liang.
#Review#GUI Agents#Reinforcement Learning#Planner-Executor Architecture#Decoupled Training#Large Vision-Language Models#Specialization#Generalization#Computer Use Agent

[Paper Review] Beyond Transcription: Mechanistic Interpretability in ASR

A detailed review of the paper 'Beyond Transcription: Mechanistic Interpretability in ASR', posted on [arXiv] by Aviv Shamsian.
#Review#ASR#Mechanistic Interpretability#Logit Lens#Linear Probing#Activation Patching#Hallucinations#Repetitions#Encoder-Decoder

[Paper Review] Wan-S2V: Audio-Driven Cinematic Video Generation

A detailed review of the paper 'Wan-S2V: Audio-Driven Cinematic Video Generation', posted on [arXiv] by Chaonan Ji.
#Review#Audio-Driven Video Generation#Cinematic Video#Diffusion Models#Transformer Architecture#Long Video Consistency#Human Animation#Multimodal Control#Data Curation

[Paper Review] VibeVoice Technical Report

A detailed review of the paper 'VibeVoice Technical Report', posted on [arXiv] by Yaoyao Chang.
#Review#Speech Synthesis#Long-form Audio#Multi-speaker#Next-token Diffusion#Speech Tokenizer#Large Language Model#Variational Autoencoder#Audio Compression

[Paper Review] TreePO: Bridging the Gap of Policy Optimization and Efficacy and Inference Efficiency with Heuristic Tree-based Modeling

A detailed review of the paper 'TreePO: Bridging the Gap of Policy Optimization and Efficacy and Inference Efficiency with Heuristic Tree-based Modeling', posted on [arXiv] by Zhoufutu Wen.
#Review#Reinforcement Learning#Policy Optimization#Large Language Models#Inference Efficiency#Tree Search#Segment-level Decoding#Advantage Estimation#Reasoning

[Paper Review] Spacer: Towards Engineered Scientific Inspiration

A detailed review of the paper 'Spacer: Towards Engineered Scientific Inspiration', posted on [arXiv] by zerojun48.
#Review#Scientific Discovery#Large Language Models (LLMs)#Decontextualization#Keyword Graph#Multi-Agent System#Scientific Ideation#Research Automation#Inspiration Engine

[Paper Review] MovieCORE: COgnitive REasoning in Movies

A detailed review of the paper 'MovieCORE: COgnitive REasoning in Movies', posted on [arXiv] by Hung-Ting Su.
#Review#Video Question Answering (VQA)#Cognitive Reasoning#System-2 Thinking#Multi-agent LLMs#Dataset Creation#Movie Understanding#Cinematic Content#Agentic Enhancement

[Paper Review] FastMesh: Efficient Artistic Mesh Generation via Component Decoupling

A detailed review of the paper 'FastMesh: Efficient Artistic Mesh Generation via Component Decoupling', posted on [arXiv] by Xingang Pan.
#Review#3D Mesh Generation#Component Decoupling#Autoregressive Models#Bidirectional Transformer#Fidelity Enhancement#Prediction Filtering#Token Efficiency#Artistic Meshes

[Paper Review] Autoregressive Universal Video Segmentation Model

A detailed review of the paper 'Autoregressive Universal Video Segmentation Model', posted on [arXiv] by Albert Gu.
#Review#Video Segmentation#Autoregressive Model#Universal Model#State Space Models#Mamba#Parallel Training#Streaming Video#Deep Learning

[Paper Review] UQ: Assessing Language Models on Unsolved Questions

A detailed review of the paper 'UQ: Assessing Language Models on Unsolved Questions', posted on [arXiv] by Wei Liu.
#Review#LLM Evaluation#Unsolved Questions#AI Benchmark#Oracle-Free Validation#Generator-Validator Gap#Community Evaluation#Stack Exchange

[Paper Review] SpotEdit: Evaluating Visually-Guided Image Editing Methods

A detailed review of the paper 'SpotEdit: Evaluating Visually-Guided Image Editing Methods', posted on [arXiv] by Ersin Yumer.
#Review#Visually-Guided Image Editing#Multimodal Models#Benchmark#Hallucination#Diffusion Models#Autoregressive Models#Evaluation Metrics

[Paper Review] ST-Raptor: LLM-Powered Semi-Structured Table Question Answering

A detailed review of the paper 'ST-Raptor: LLM-Powered Semi-Structured Table Question Answering', posted on [arXiv] by Wei Zhou.
#Review#Semi-structured Tables#Question Answering#LLMs#Hierarchical Orthogonal Tree#Table Layout Understanding#Pipeline Generation#Verification Mechanism

[Paper Review] MV-RAG: Retrieval Augmented Multiview Diffusion

A detailed review of the paper 'MV-RAG: Retrieval Augmented Multiview Diffusion', posted on [arXiv] by sagiebenaim.
#Review#Retrieval Augmented Generation#Multiview Diffusion#Text-to-3D Generation#Out-of-Domain#Image Retrieval#3D Consistency#Diffusion Models#Hybrid Training

[Paper Review] Limitations of Normalization in Attention Mechanism

A detailed review of the paper 'Limitations of Normalization in Attention Mechanism', posted on [arXiv] by Radu State.
#Review#Attention Mechanism#Normalization#Softmax#Transformer Models#Gradient Sensitivity#Token Separability#Context Length#GPT-2

[Paper Review] Breaking the Exploration Bottleneck: Rubric-Scaffolded Reinforcement Learning for General LLM Reasoning

A detailed review of the paper 'Breaking the Exploration Bottleneck: Rubric-Scaffolded Reinforcement Learning for General LLM Reasoning', posted on [arXiv] by Jiale Zhao.
#Review#Reinforcement Learning#Large Language Models#Exploration Bottleneck#Instructional Scaffolding#Rubric-based Rewards#General Reasoning#RL with Verifiable Rewards#Policy Optimization

[Paper Review] EgoTwin: Dreaming Body and View in First Person

A detailed review of the paper 'EgoTwin: Dreaming Body and View in First Person', posted on [arXiv] by Wentao Wang.
#Review#Egocentric Video Generation#Human Motion Synthesis#Diffusion Transformers#Multimodal Generation#Viewpoint Alignment#Causal Interplay#First-Person Vision

[Paper Review] CRISP: Persistent Concept Unlearning via Sparse Autoencoders

A detailed review of the paper 'CRISP: Persistent Concept Unlearning via Sparse Autoencoders', posted on [arXiv] by Yonatan Belinkov.
#Review#Concept Unlearning#Sparse Autoencoders (SAEs)#LLMs#Parameter-Efficient Fine-Tuning#Model Interpretability#Safety-Critical AI#Feature Suppression#WMDP Benchmark

[Paper Review] Waver: Wave Your Way to Lifelike Video Generation

A detailed review of the paper 'Waver: Wave Your Way to Lifelike Video Generation', posted on [arXiv] by Yifu Zhang.
#Review#Video Generation#Foundation Model#Diffusion Model#Transformer#Text-to-Video#Image-to-Video#Super-Resolution#Data Curation

[Paper Review] Mobile-Agent-v3: Foundamental Agents for GUI Automation

A detailed review of the paper 'Mobile-Agent-v3: Foundamental Agents for GUI Automation', posted on [arXiv] by Haowei Liu.
#Review#GUI Automation#Multimodal Agents#Foundational Models#Reinforcement Learning#Large Language Models#Cross-Platform#Self-Supervised Learning

[Paper Review] Intern-S1: A Scientific Multimodal Foundation Model

A detailed review of the paper 'Intern-S1: A Scientific Multimodal Foundation Model', posted on [arXiv] by xuhuang87.
#Review#Multimodal Foundation Model#Scientific AI#Reinforcement Learning#Mixture-of-Experts (MoE)#Dynamic Tokenizer#Data Curation#Low-Resource Learning

[Paper Review] INTIMA: A Benchmark for Human-AI Companionship Behavior

A detailed review of the paper 'INTIMA: A Benchmark for Human-AI Companionship Behavior', posted on [arXiv] by Yacine Jernite.
#Review#AI Companionship#Benchmark#Language Models (LLMs)#Human-AI Interaction#Emotional AI#Boundary Setting#Psychological Frameworks#Evaluation Metrics

[Paper Review] Deep Think with Confidence

A detailed review of the paper 'Deep Think with Confidence', posted on [arXiv] by Xuewei Wang.
#Review#LLM Reasoning#Confidence Filtering#Self-Consistency#Test-Time Optimization#Computational Efficiency#Adaptive Sampling#Early Stopping#Majority Voting

[Paper Review] A Survey on Large Language Model Benchmarks

A detailed review of the paper 'A Survey on Large Language Model Benchmarks', posted on [arXiv] by Siyi Li.
#Review#LLM Benchmarks#Evaluation#Systematic Review#General Capabilities#Domain-Specific Benchmarks#Target-Specific Benchmarks#Data Contamination#AI Ethics

[논문리뷰] RynnEC: Bringing MLLMs into Embodied World

jiangpinliu이 [arXiv]에 게시한 'RynnEC: Bringing MLLMs into Embodied World' 논문에 대한 자세한 리뷰입니다.
#Review#Multi-modal Large Language Models#Embodied AI#Embodied Cognition#Video Understanding#Instance Segmentation#Spatial Reasoning#Robotics

[논문리뷰] On-Policy RL Meets Off-Policy Experts: Harmonizing Supervised Fine-Tuning and Reinforcement Learning via Dynamic Weighting

Guoyin Wang이 [arXiv]에 게시한 'On-Policy RL Meets Off-Policy Experts: Harmonizing Supervised Fine-Tuning and Reinforcement Learning via Dynamic Weighting' 논문에 대한 자세한 리뷰입니다.
#Review#Large Language Models#Reinforcement Learning#Supervised Fine-Tuning#On-Policy RL#Off-Policy Experts#Dynamic Weighting#LLM Alignment#Reasoning

[논문리뷰] Local Scale Equivariance with Latent Deep Equilibrium Canonicalizer

Jeremiah Jiang이 [arXiv]에 게시한 'Local Scale Equivariance with Latent Deep Equilibrium Canonicalizer' 논문에 대한 자세한 리뷰입니다.
#Review#Scale Equivariance#Deep Equilibrium Models#Canonicalization#Computer Vision#Image Classification#Semantic Segmentation#Latent Representation#Monotone Scaling

[Paper Review] Semantic IDs for Joint Generative Search and Recommendation

A detailed review of the paper 'Semantic IDs for Joint Generative Search and Recommendation', posted to [arXiv] by Enrico Palumbo.
#Review#Generative Models#Search and Recommendation#Semantic IDs#Bi-Encoder#Quantization#Multi-Task Learning#Retrieval Augmented Generation

[Paper Review] Prompt Orchestration Markup Language

A detailed review of the paper 'Prompt Orchestration Markup Language', posted to [arXiv] by Yuqing Yang.
#Review#Prompt Engineering#Large Language Models#Markup Language#Structured Prompting#IDE Support#Multimodal Data#Styling System#Development Toolkit

[Paper Review] OmniTry: Virtual Try-On Anything without Masks

A detailed review of the paper 'OmniTry: Virtual Try-On Anything without Masks', posted to [arXiv] by Xiaoduan Feng.
#Review#Virtual Try-On#Diffusion Model#Mask-Free#Image Inpainting#ID Consistency#Wearable Objects#Generative AI

[Paper Review] CorrSteer: Steering Improves Task Performance and Safety in LLMs through Correlation-based Sparse Autoencoder Feature Selection

A detailed review of the paper 'CorrSteer: Steering Improves Task Performance and Safety in LLMs through Correlation-based Sparse Autoencoder Feature Selection', posted to [arXiv] by Adriano Koshiyama.
#Review#Sparse Autoencoders#LLM Steering#Feature Selection#Correlation Analysis#AI Safety#Bias Mitigation#Mechanistic Interpretability

[Paper Review] CAMAR: Continuous Actions Multi-Agent Routing

A detailed review of the paper 'CAMAR: Continuous Actions Multi-Agent Routing', posted to [arXiv] by Alexey Skrynnik.
#Review#Multi-Agent Reinforcement Learning#Continuous Control#Pathfinding#MARL Benchmark#GPU Acceleration#Robotics Simulation#Scalability#Heterogeneous Agents

[Paper Review] Reinforcement Learning with Rubric Anchors

A detailed review of the paper 'Reinforcement Learning with Rubric Anchors', posted to [arXiv] by Haokai Xu.
#Review#Reinforcement Learning#Large Language Models#Rubric-based Reward#RLVR Extension#Human-centric AI#Controllable Generation#Reward Hacking Mitigation

[Paper Review] Precise Action-to-Video Generation Through Visual Action Prompts

A detailed review of the paper 'Precise Action-to-Video Generation Through Visual Action Prompts', posted to [arXiv] by Minghan Qin.
#Review#Action-to-Video Generation#Visual Action Prompts#Skeleton Representation#Human-Object Interaction#Robotic Manipulation#Cross-Domain Transfer#Diffusion Models

[Paper Review] Ovis2.5 Technical Report

A detailed review of the paper 'Ovis2.5 Technical Report', posted to [arXiv] by Yang Li.
#Review#Multimodal LLMs#Native Resolution Vision#Deep Reasoning#Chart Analysis#OCR#Visual Grounding#Training Efficiency#Preference Optimization

[Paper Review] Next Visual Granularity Generation

A detailed review of the paper 'Next Visual Granularity Generation', posted to [arXiv] by Kang Liao.
#Review#Image Generation#Granularity Control#Structured Representation#Hierarchical Generation#Coarse-to-fine#Visual Tokenization#Latent Space

[Paper Review] 4DNeX: Feed-Forward 4D Generative Modeling Made Easy

A detailed review of the paper '4DNeX: Feed-Forward 4D Generative Modeling Made Easy', posted to [arXiv] by Zeng Tao.
#Review#4D Generation#Dynamic 3D#Generative Models#Diffusion Models#Single Image Input#Video Synthesis#Point Clouds#Dataset

[Paper Review] X-Node: Self-Explanation is All We Need

A detailed review of the paper 'X-Node: Self-Explanation is All We Need', posted to [arXiv] by Islem Rekik.
#Review#Graph Neural Networks#Explainable AI#Self-Explanation#Node Classification#Medical Imaging#Natural Language Processing#Interpretability

[Paper Review] Thyme: Think Beyond Images

A detailed review of the paper 'Thyme: Think Beyond Images', posted to [arXiv] by Wei Chen.
#Review#Multimodal LLMs#Code Generation#Image Processing#Reinforcement Learning#Supervised Fine-Tuning#Visual Reasoning#Sandbox

[Paper Review] SSRL: Self-Search Reinforcement Learning

A detailed review of the paper 'SSRL: Self-Search Reinforcement Learning', posted to [arXiv] by Yanxu Chen.
#Review#Reinforcement Learning#Large Language Models#Self-Search#Sim-to-Real Transfer#Agentic AI#Knowledge Retrieval#Reward Modeling

[Paper Review] DINOv3

A detailed review of the paper 'DINOv3', posted to [arXiv] by Maxime Oquab.
#Review#Self-supervised Learning#Foundation Models#Vision Transformer#Dense Feature Maps#Gram Anchoring#Model Distillation#Geospatial AI

[Paper Review] Controlling Multimodal LLMs via Reward-guided Decoding

A detailed review of the paper 'Controlling Multimodal LLMs via Reward-guided Decoding', posted to [arXiv] by Michal Drozdzal.
#Review#Multimodal LLMs#Reward Models#Guided Decoding#Visual Grounding#Hallucination Mitigation#Object Precision#Object Recall#Inference-time Control

[Paper Review] From Black Box to Transparency: Enhancing Automated Interpreting Assessment with Explainable AI in College Classrooms

A detailed review of the paper 'From Black Box to Transparency: Enhancing Automated Interpreting Assessment with Explainable AI in College Classrooms', posted to [arXiv] by Ziyin Zhang.
#Review#Automated Interpreting Assessment#Explainable AI#Data Augmentation#Variational Autoencoder#SHAP#Interpreting Quality#Natural Language Processing

[Paper Review] A Survey on Diffusion Language Models

A detailed review of the paper 'A Survey on Diffusion Language Models', posted to [arXiv] by Zhiqiang Shen.
#Review#Diffusion Language Models#Generative AI#Parallel Decoding#Text Generation#Multimodal AI#Model Compression#Reinforcement Learning from Human Feedback#Inference Optimization

[Paper Review] When Explainability Meets Privacy: An Investigation at the Intersection of Post-hoc Explainability and Differential Privacy in the Context of Natural Language Processing

A detailed review of the paper 'When Explainability Meets Privacy: An Investigation at the Intersection of Post-hoc Explainability and Differential Privacy in the Context of Natural Language Processing', posted to [arXiv] by Gjergji Kasneci.
#Review#Natural Language Processing (NLP)#Explainable AI (XAI)#Post-hoc Explainability#Differential Privacy (DP)#Privacy-Utility Trade-off#Model Faithfulness#Text Privatization

[Paper Review] WGAST: Weakly-Supervised Generative Network for Daily 10 m Land Surface Temperature Estimation via Spatio-Temporal Fusion

A detailed review of the paper 'WGAST: Weakly-Supervised Generative Network for Daily 10 m Land Surface Temperature Estimation via Spatio-Temporal Fusion', posted to [arXiv] by Rachid Nedjai.
#Review#Spatio-Temporal Fusion#Land Surface Temperature#Generative Adversarial Network#Weakly-Supervised Learning#Remote Sensing#Deep Learning

[Paper Review] Train Long, Think Short: Curriculum Learning for Efficient Reasoning

A detailed review of the paper 'Train Long, Think Short: Curriculum Learning for Efficient Reasoning', posted to [arXiv] by Marzyeh Ghassemi.
#Review#Curriculum Learning#Reinforcement Learning#Large Language Models#Reasoning Efficiency#Token Budget Control#Group Relative Policy Optimization#Chain-of-Thought

[Paper Review] OpenCUA: Open Foundations for Computer-Use Agents

A detailed review of the paper 'OpenCUA: Open Foundations for Computer-Use Agents', posted to [arXiv] by Tianbao Xie.
#Review#Computer-Use Agents#Vision-Language Models#Chain-of-Thought Reasoning#Large-scale Dataset#Open-source Framework#Desktop Automation#Agent Evaluation

[Paper Review] Cut2Next: Generating Next Shot via In-Context Tuning

A detailed review of the paper 'Cut2Next: Generating Next Shot via In-Context Tuning', posted to [arXiv] by Yu Qiao.
#Review#Next Shot Generation#In-Context Tuning#Diffusion Transformer#Cinematic Continuity#Hierarchical Prompting#Video Generation#Shot Editing

[Paper Review] Bridging Theory and Practice in Quantum Game Theory: Optimized Implementation of the Battle of the Sexes with Error Mitigation on NISQ Hardware

A detailed review of the paper 'Bridging Theory and Practice in Quantum Game Theory: Optimized Implementation of the Battle of the Sexes with Error Mitigation on NISQ Hardware', posted to [arXiv] by Jhon Alejandro Andrade.
#Review#Quantum Game Theory#NISQ Hardware#Error Mitigation#Battle of the Sexes#Qiskit#Quantum Computing#Strategic Coordination#Payoff Maximization

[Paper Review] Aryabhata: An exam-focused language model for JEE Math

A detailed review of the paper 'Aryabhata: An exam-focused language model for JEE Math', posted to [arXiv] by Sandeep Varma.
#Review#Language Model#Math Reasoning#JEE#Supervised Fine-Tuning#Reinforcement Learning#Model Merging#Chain-of-Thought#Curriculum Learning

[Paper Review] Reinforcement Learning in Vision: A Survey

A detailed review of the paper 'Reinforcement Learning in Vision: A Survey', posted to [arXiv] by Qingwei Meng.
#Review#Reinforcement Learning (RL)#Computer Vision (CV)#Multimodal Large Language Models (MLLMs)#Visual Generation#Vision-Language-Action (VLA) Models#Policy Optimization#Reward Modeling

[Paper Review] A Comprehensive Survey of Self-Evolving AI Agents: A New Paradigm Bridging Foundation Models and Lifelong Agentic Systems

A detailed review of the paper 'A Comprehensive Survey of Self-Evolving AI Agents: A New Paradigm Bridging Foundation Models and Lifelong Agentic Systems', posted to [arXiv] by Xinhao Yi.
#Review#Self-Evolving AI Agents#Lifelong Learning#Foundation Models#Multi-Agent Systems#Agent Optimization#Prompt Engineering#Tool Use#AI Safety#Survey

[Paper Review] Memp: Exploring Agent Procedural Memory

A detailed review of the paper 'Memp: Exploring Agent Procedural Memory', posted to [arXiv] by Shuofei Qiao.
#Review#Procedural Memory#LLM Agents#Memory Management#Task Automation#Lifelong Learning#Experience Replay#Agent Learning

[Paper Review] GENIE: Gaussian Encoding for Neural Radiance Fields Interactive Editing

A detailed review of the paper 'GENIE: Gaussian Encoding for Neural Radiance Fields Interactive Editing', posted to [arXiv] by Przemysław Spurek.
#Review#Neural Radiance Fields (NeRF)#Gaussian Splatting (GS)#Interactive Editing#3D Scene Representation#Physics Simulation#Hybrid Model#Real-time Rendering#Ray Tracing

[Paper Review] Visual Document Understanding and Question Answering: A Multi-Agent Collaboration Framework with Test-Time Scaling

A detailed review of the paper 'Visual Document Understanding and Question Answering: A Multi-Agent Collaboration Framework with Test-Time Scaling', posted to [arXiv] by Ruolin Shen.
#Review#Visual Document Understanding#Visual Question Answering#Multi-Agent System#Test-Time Scaling#Self-Correction#Mixed Reward Modeling#Large Language Models

[Paper Review] R-Zero: Self-Evolving Reasoning LLM from Zero Data

A detailed review of the paper 'R-Zero: Self-Evolving Reasoning LLM from Zero Data', posted to [arXiv] by Zongxia Li.
#Review#Self-Evolving LLM#Reinforcement Learning#Curriculum Learning#Reasoning#Large Language Models#Self-Play#Zero-Data Training

[Paper Review] Marco-Voice Technical Report

A detailed review of the paper 'Marco-Voice Technical Report', posted to [arXiv] by Qingjuan Li.
#Review#Speech Synthesis#Voice Cloning#Emotion Control#Text-to-Speech#Disentanglement#Contrastive Learning#Flow Matching#Emotional Speech Dataset

[Paper Review] DeepPHY: Benchmarking Agentic VLMs on Physical Reasoning

A detailed review of the paper 'DeepPHY: Benchmarking Agentic VLMs on Physical Reasoning', posted to [arXiv] by Ziming Wang.
#Review#Vision Language Models (VLMs)#Agentic AI#Physical Reasoning#Benchmark#Simulation Environments#Action Planning#Interactive AI

[Paper Review] Can Large Multimodal Models Actively Recognize Faulty Inputs? A Systematic Evaluation Framework of Their Input Scrutiny Ability

A detailed review of the paper 'Can Large Multimodal Models Actively Recognize Faulty Inputs? A Systematic Evaluation Framework of Their Input Scrutiny Ability', posted to [arXiv] by Yuan Wu.
#Review#Large Multimodal Models#Input Scrutiny#Error Detection#Faulty Inputs#Evaluation Framework#Modality Preference#Cross-Modal Inconsistency

[Paper Review] Are Today's LLMs Ready to Explain Well-Being Concepts?

A detailed review of the paper 'Are Today's LLMs Ready to Explain Well-Being Concepts?', posted to [arXiv] by Huan Liu.
#Review#Large Language Models#Well-being Concepts#LLM Evaluation#Principle-Guided Evaluation#LLM-as-a-Judge#Supervised Fine-Tuning (SFT)#Direct Preference Optimization (DPO)#Explanation Generation

[Paper Review] Sotopia-RL: Reward Design for Social Intelligence

A detailed review of the paper 'Sotopia-RL: Reward Design for Social Intelligence', posted to [arXiv] by Keyang Xuan.
#Review#Social Intelligence#Reinforcement Learning#Reward Design#Large Language Models#Utterance-level Rewards#Multi-dimensional Rewards#Partial Observability#SOTOPIA

[Paper Review] RL-PLUS: Countering Capability Boundary Collapse of LLMs in Reinforcement Learning with Hybrid-policy Optimization

A detailed review of the paper 'RL-PLUS: Countering Capability Boundary Collapse of LLMs in Reinforcement Learning with Hybrid-policy Optimization', posted to [arXiv] by Kechi Zhang.
#Review#Large Language Models#Reinforcement Learning#Capability Collapse#Hybrid Policy Optimization#Multiple Importance Sampling#Exploration#Math Reasoning#Out-of-Distribution

[Paper Review] IAUNet: Instance-Aware U-Net

A detailed review of the paper 'IAUNet: Instance-Aware U-Net', posted to [arXiv] by Dmytro Fishman.
#Review#Instance Segmentation#U-Net#Query-based Model#Transformer Decoder#Biomedical Imaging#Cell Segmentation#Deep Learning

[Paper Review] HPSv3: Towards Wide-Spectrum Human Preference Score

A detailed review of the paper 'HPSv3: Towards Wide-Spectrum Human Preference Score', posted to [arXiv] by Hongsheng Li.
#Review#Human Preference Score#Text-to-Image Generation#Image Evaluation#Vision-Language Models (VLMs)#Uncertainty-Aware Ranking Loss#Dataset#Iterative Refinement#Chain-of-Thought

[Paper Review] Agent Lightning: Train ANY AI Agents with Reinforcement Learning

A detailed review of the paper 'Agent Lightning: Train ANY AI Agents with Reinforcement Learning', posted to [arXiv] by Zilong Wang.
#Review#Reinforcement Learning#Large Language Models#AI Agents#Framework#Markov Decision Process#Hierarchical RL#Training-Agent Disaggregation#Observability

[Paper Review] Tool-integrated Reinforcement Learning for Repo Deep Search

A detailed review of the paper 'Tool-integrated Reinforcement Learning for Repo Deep Search', posted to [arXiv] by Yanzhen Zou.
#Review#Issue Localization#Large Language Models (LLMs)#Reinforcement Learning (RL)#Supervised Fine-tuning (SFT)#Tool-integrated Agents#Software Engineering#Code Search

[Paper Review] TRACEALIGN -- Tracing the Drift: Attributing Alignment Failures to Training-Time Belief Sources in LLMs

A detailed review of the paper 'TRACEALIGN -- Tracing the Drift: Attributing Alignment Failures to Training-Time Belief Sources in LLMs', posted to [arXiv] by Aman Chadha.
#Review#LLM Alignment#Alignment Drift#Training Data Provenance#Belief Conflict Index (BCI)#Suffix Array#Safety Interventions#Reinforcement Learning from Human Feedback#Explainable AI

[Paper Review] Multi-human Interactive Talking Dataset

A detailed review of the paper 'Multi-human Interactive Talking Dataset', posted to [arXiv] by Mike Zheng Shou.
#Review#Multi-human Video Generation#Interactive Talking#Dataset#Audio-driven Animation#Pose Control#Speech Interaction#Diffusion Models

[Paper Review] LongVie: Multimodal-Guided Controllable Ultra-Long Video Generation

A detailed review of the paper 'LongVie: Multimodal-Guided Controllable Ultra-Long Video Generation', posted to [arXiv] by Chenyang Si.
#Review#Ultra-long Video Generation#Multimodal Guidance#Controllable Video Generation#Diffusion Models#Temporal Consistency#Visual Quality#Autoregressive Generation#Degradation-aware Training

[Paper Review] ChartCap: Mitigating Hallucination of Dense Chart Captioning

A detailed review of the paper 'ChartCap: Mitigating Hallucination of Dense Chart Captioning', posted to [arXiv] by Gunhee Kim.
#Review#Chart Captioning#Hallucination Mitigation#Dataset Generation#Visual Language Models#Cycle Consistency#Reference-Free Metric#Data Visualization

[Paper Review] AlignGuard-LoRA: Alignment-Preserving Fine-Tuning via Fisher-Guided Decomposition and Riemannian-Geodesic Collision Regularization

A detailed review of the paper 'AlignGuard-LoRA: Alignment-Preserving Fine-Tuning via Fisher-Guided Decomposition and Riemannian-Geodesic Collision Regularization', posted to [arXiv] by Aman Chadha.
#Review#Alignment Preservation#Fine-Tuning#LoRA#Fisher Information Matrix#Catastrophic Forgetting#LLM Safety#Riemannian Geometry#Parameter-Efficient Learning

[Paper Review] Qwen-Image Technical Report

A detailed review of the paper 'Qwen-Image Technical Report', posted to [arXiv] by Kaiyuan Gao.
#Review#Image Generation#Text-to-Image#Image Editing#Text Rendering#Multimodal Diffusion Transformer#Curriculum Learning#Reinforcement Learning#Foundation Model

[Paper Review] Personalized Safety Alignment for Text-to-Image Diffusion Models

A detailed review of the paper 'Personalized Safety Alignment for Text-to-Image Diffusion Models', posted to [arXiv] by Kaidong Yu.
#Review#Personalized Safety Alignment#Text-to-Image Diffusion Models#DPO#User Preferences#Content Moderation#Generative AI#Cross-Attention#Safety Alignment

[Paper Review] Exploitation Is All You Need... for Exploration

A detailed review of the paper 'Exploitation Is All You Need... for Exploration', posted to [arXiv] by Jesse Roberts.
#Review#Reinforcement Learning#Exploration-Exploitation#Meta-RL#Transformer Architecture#Emergent Behavior#Multi-Armed Bandits#Gridworlds#Pseudo-Thompson Sampling

[Paper Review] CellForge: Agentic Design of Virtual Cell Models

A detailed review of the paper 'CellForge: Agentic Design of Virtual Cell Models', posted to [arXiv] by Daniel Shao.
#Review#AI Scientist#Multi-Agent System#Virtual Cell Modeling#Single-Cell Perturbation Prediction#Deep Learning#Automated Model Design#Code Generation#Retrieval-Augmented Generation

[Paper Review] SWE-Exp: Experience-Driven Software Issue Resolution

A detailed review of the paper 'SWE-Exp: Experience-Driven Software Issue Resolution', posted to [arXiv] by Heng Lian.
#Review#Software Issue Resolution#LLM Agents#Experience-Driven Learning#Automated Program Repair#Multi-Agent Systems#Knowledge Management#Continuous Learning

[Paper Review] PixNerd: Pixel Neural Field Diffusion

A detailed review of the paper 'PixNerd: Pixel Neural Field Diffusion', posted to [arXiv] by Limin Wang.
#Review#Diffusion Models#Neural Fields#Pixel Space#Generative Models#Image Synthesis#Transformer Architecture#End-to-End Learning

[Paper Review] Multimodal Referring Segmentation: A Survey

A detailed review of the paper 'Multimodal Referring Segmentation: A Survey', posted to [arXiv] by Zuxuan Wu.
#Review#Multimodal Learning#Referring Segmentation#Vision-Language Models#Image Segmentation#Video Segmentation#3D Vision#Survey

[Paper Review] Learning an Efficient Multi-Turn Dialogue Evaluator from Multiple Judges

A detailed review of the paper 'Learning an Efficient Multi-Turn Dialogue Evaluator from Multiple Judges', posted to [arXiv] by Chengfei Lv.
#Review#Multi-Turn Dialogue Evaluation#LLM-as-a-Judge#Multi-Judge Aggregation#Preference Learning#Dialogue Quality Assessment#Maximum Likelihood Estimation#Computational Efficiency

[Paper Review] iLRM: An Iterative Large 3D Reconstruction Model

A detailed review of the paper 'iLRM: An Iterative Large 3D Reconstruction Model', posted to [arXiv] by Abdelrahman Mohamed.
#Review#3D Reconstruction#Gaussian Splatting#Iterative Refinement#Transformer Architecture#Multi-view Learning#Scalability#Feed-forward Models

[Paper Review] Seed-Prover: Deep and Broad Reasoning for Automated Theorem Proving

A detailed review of the paper 'Seed-Prover: Deep and Broad Reasoning for Automated Theorem Proving', posted to [arXiv] by Zhicheng Jiang.
#Review#Automated Theorem Proving#Large Language Models#Formal Verification#Reinforcement Learning#Lean#Geometry Reasoning#Chain-of-Thought#Lemma-Style Proving

[Paper Review] RecGPT Technical Report

A detailed review of the paper 'RecGPT Technical Report', posted to [arXiv] by Jian Wu.
#Review#Recommender Systems#Large Language Models (LLMs)#User Intent Modeling#Multi-Stage Training#Human-in-the-Loop#E-commerce#Filter Bubble Mitigation#Matthew Effect

[Paper Review] NeRF Is a Valuable Assistant for 3D Gaussian Splatting

A detailed review of the paper 'NeRF Is a Valuable Assistant for 3D Gaussian Splatting', posted to [arXiv] by ZeSheng Wang.
#Review#NeRF#3D Gaussian Splatting#Hybrid Model#Joint Optimization#Scene Representation#Neural Rendering#Residual Learning#Sparse View

[Paper Review] Flow Equivariant Recurrent Neural Networks

A detailed review of the paper 'Flow Equivariant Recurrent Neural Networks', posted to [arXiv] by T. Anderson Keller.
#Review#Flow Equivariance#Recurrent Neural Networks#Sequence Models#Group Equivariance#Lie Subgroups#Generalization#Time-Parameterized Symmetries

[Paper Review] Enhanced Arabic Text Retrieval with Attentive Relevance Scoring

A detailed review of the paper 'Enhanced Arabic Text Retrieval with Attentive Relevance Scoring', posted to [arXiv] by Abdenour Hadid.
#Review#Arabic NLP#Dense Passage Retrieval#Attentive Relevance Scoring#Information Retrieval#Question Answering#Transformer Models#Semantic Matching

[Paper Review] Efficient Machine Unlearning via Influence Approximation

A detailed review of the paper 'Efficient Machine Unlearning via Influence Approximation', posted to [arXiv] by Enhong Chen.
#Review#Machine Unlearning#Influence Function#Incremental Learning#Privacy Protection#Gradient Optimization#Model Editing#Computational Efficiency

[Paper Review] Beyond Linear Bottlenecks: Spline-Based Knowledge Distillation for Culturally Diverse Art Style Classification

A detailed review of the paper 'Beyond Linear Bottlenecks: Spline-Based Knowledge Distillation for Culturally Diverse Art Style Classification', posted to [arXiv] by Abdelmalik Taleb-Ahmed.
#Review#Kolmogorov-Arnold Networks#Knowledge Distillation#Art Style Classification#Self-Supervised Learning#Spline-Based Activation#Dual-Teacher#Gram Matrix